Friday, December 9, 2011

99 bottles

Here's my version of this classic:
USING: combinators io kernel math
math.parser quotations sequences ;
IN: bottles

: bottle(s) ( n -- s ) 1 = "bottle " " bottles " ? ;

: (line) ( bottle(s) -- string )
dup {
{ 1 [ bottle(s) "1 " ] }
{ 0 [ bottle(s) "No more" ] }
[ bottle(s) swap number>string ]
} case swap "of beer" 3append ;

: (line-1) ( bottle(s) -- str )
(line) " on the wall" append ;

: line-1 ( bottle(s) -- )
(line-1) "," append print ;

: line-2 ( bottle(s) -- )
(line) "." append print ;

: line-3 ( bottle(s) -- )
0 = "Go to the store, buy some more,"
"Take one down, pass it around," ? print ;

: line-4 ( bottle(s) -- )
1 - dup 0 < [ drop 99 ] when
(line-1) "!\n" append print ;

: sing-lines ( x seq -- )
[ 1quotation ] map [ call( x -- ) ] with each ;

: sing-verse ( bottle(s) -- )
{ line-1 line-2 line-3 line-4 } sing-lines ;

: sing-song ( -- )
100 iota <reversed> [ sing-verse ] each ;

MAIN: sing-song

Thursday, December 8, 2011

Step-debugging parsing words

With all the macro play posts, I would have gone insane by now if I had had been juggling the stack in my head.

So I wrote the equivalent of break but for parsing words. The reason why simply using break (or its always-available alias B) doesn't work for parsing words is because the walker expands them before you have a chance to step into them.
So here's B's brother, B: :
USING: definitions kernel parser tools.continuations ;
IN: syntax-break

SYNTAX: B: scan-word definition
[ break "now press O I" drop ]
prepose call( accum -- accum ) ;

Its definition is simple: it scans the next word, gets its definition (aka source code) as a quotation, prepends break to it, and then calls it. That's exactly what I was doing manually before I wrote this. :P

As a bonus and to be self-explanatory, I added a little "live" documentation to it (they're instructions, actually) so that I don't forget how to use it.

Oh, and I have to use call( instead of call because the quotation I'm calling isn't know at compile time.

Try it out:
IN: scratchpad "world" "hello" B: [| w h | h ", " w 3append ] call .

And then follow the instructions (hint: you'll land inside B:'s definition still, right at the "now press O I" drop part. Guess what happens when you press O and then I.)

Indexed stack shufflers - part 4 (final)

Let's finish this chapter at once.
Here's a less flippant implementation of the stack shufflers we've been discussing.
USING: fry locals.parser locals.types math.parser math.ranges
multiline namespaces sequences ;
IN: shortlocals

: >n-locals ( n -- seq ) [1,b] [ number>string ] map ;

: define-n-locals ( accum n -- accum )
>n-locals '[ _ [ make-local ] map ] H{ } make-assoc ! words hash
(parse-lambda) <lambda> ?rewrite-closures append! ;

SYNTAX: [2> 2 define-n-locals ;
SYNTAX: [3> 3 define-n-locals ;

And here's how it works (notice how it plays nice with other local-introducing parse-words):
IN: scratchpad 2 9 [2> 2 1 / ] call .
4+1/2
IN: scratchpad 2 9 [2> 2 1 / 3 + ] call .
7+1/2
IN: scratchpad 2 9 100 [3> 2 1 / 3 + ] call .
104+1/2
IN: scratchpad 2 9 100 [3> 3 [| a | a 1 / ] call 2 1 3array ] call .
{ 50 9 2 }

define-n-locals is defined exactly as [| is, except we skip the first step inside parse-lambda (i.e. we skip parse-local-defs and instead simply provide the sequence of strings to build locals from.

I also consulted with the Factor Gods and they told me it's impossible to write a version of [2> that simply adds " [| 1 2 | " to the code (which would be sort of the equivalent of C-style "macros", I think).

To which I replied that it's impossible except for one very dirty way of doing it. I'll save that for a future post, when I'm feeling flippant again.

Sunday, December 4, 2011

Indexed stack shufflers - part 3

Now that I've covered some background of how parsing works in general and superficially covered locals, I can explain how the above stack shuffler works. Here's a more cleaned-up version of the code followed by a numbered explanation of what it does (the numbers on the list refer to the numbers in the code, of course):

! 3 -> { "arg-1" "arg-2" "arg-3" }
: >params-n ( n -- vec )
1 + iota unclip drop [ number>string "arg-" prepend ] map ;

GENERIC: >number ( str -- n )
M: string >number string>number ;

: str-is-int? ( str -- ? ) >number integer? ;

SYNTAX: [n>
[let 0 :> largest!
"]" [ dup str-is-int? ! 3. (below)
[ dup string>number largest max largest! "arg-" prepend ]
[ ] if ! 2.
] map-tokens ! 1.
largest ! -- tokens: { "arg-2" "arg-1" "/" } 2
] >params-n ! 4. -- tokens { "arg-1" "arg-2" }
'[ [ _ [ make-local ] map ] H{ } make-assoc ] ! 5. -- tokens [ quot ]
call rot ! 5. -- vars: { arg-1 arg-2 }
! assoc: H{ { "arg-1" arg-1 } { "arg-2" arg-2 } }
! tokens
'[ _ [ parse-datum ] map >quotation ] ! 6. -- vars assoc [ quot ]
((parse-lambda)) ! 6. -- vars [ arg-2 arg-2 / ]
<lambda> ! 7. -- [| arg-1 arg-2 | arg-2 arg-1 ]
?rewrite-closures append! ; ! 8.

Target: [n> 2 1 / ]

  1. Read all tokens until "]"

  2. For each token, if it's an integer n then covert it to "arg-${n}"

  3. While doing step 2, also keep track of the largest number seen

  4. Generate a list of n "arg-${n}" strings where n is the largest number seen

  5. Create a new blank namespace (an assoc) and fill it with words named as the "arg-n" strings

  6. Parse the tokens:

    1. Push assoc to manifest (it becomes the first namespace in the lookup path)
    2. Parse the tokens
    3. Convert the list of parsed tokens into a quotation

  7. Wrap variables and quotation into a lambda tuple

  8. Rewrite lambda so that every reference to a variable becomes an interaction with the retain stack; add the resulting quotation to the parse tree.

Here's an example usage:
2 9 [n> 2 1 / ] call . ! => 4+1/2
Hooray!

Now here is a way to break it:
2 9 [n> 2 1 [| a b | b a / ] ] call . ! => should return 2/9
Awww!

The reason why the line above breaks our word is because we're parsing the tokens too naively. The specific reason it breaks is because it matches on the first "]" which is not its own. There are several problems with our approach, all stemming from the fact that we need to read ahead to know the largest number before we can actually begin parsing anything, but reading raw tokens isn't adequate since we stop at the first "]" and not the matching \ ] .

What we'd need to do to fix this is to actually parse until \ ] (the word parse-until also executes all parsing words it comes across, so we'd have one less thing to worry about), then find the largest number, then reset the lexer head to where we started, then parse again - but then when would we be able to change the numbers into arg-1 arg-2 etc? And even worse, this would execute parsing words twice, which can be a bad thing. Even if this worked, it would only work with numbers because they always parse, but what if we wanted [x> y x / ] ? We'd have no way to write that because the first parse would fail.

No, we need a better way. Parsing until \ ] won't do because of the reasons above, so we need to go to a lower level. Maybe we have to go down to reading tokens, but the inherent problem with that is we always stop at the first "]". We need to go to a lower level still.

But I'm not convinced this problem is a low-level one. Let's think again from another perspective. What would be the ideal scenario? Well, I'd really like to use parse-until since it executes the parsing words for me, and thus handles the balancing of the brackets automatically. Why can't I use it, again? Ah, because it parses tokens and throws an error if a word is not found; and because there's no room to turn numbers into arg-n words (or to otherwise turn non-words into words).

(a couple of hours later)

I'm starting to think this is impossible for the general case, and very time-consuming for the static case where we know the exact syntax of all parsing words that introduce and shadow local variables. The easy way around is to add a restriction by forward declaration:
2 9 [2> 2 1 / ] call . ! => would return 4+1/2 if implemented
I'll write the code for that later.

Saturday, December 3, 2011

Indexed stack shufflers - part 2

Here's the explanation I promised. First we need an understanding of what we're talking about: a parser word (equivalent of Common Lisp's reader macros) that allows me to reference the stack by index. The indexing can be whatever you want, but I have written two as examples; the first and my favorite is the "Natural" indexing where the TOS (top-of-stack) is n, where n is the arity of the quotation being built. That's right, this means this indexing is 1-based:

2 8 [ 2 1 / ] call ! => 4

The 1 inside the quotation means element <1> on the stack, and 2 means element <2>, i.e. the TOS - because the quotation is of arity 2. Maybe this will help:

foo <1> <2> <3> [ quotation of arity 3 ]
foo bar <1> <2> [ quotation of arity 2 ]
foo bar quux <1> [ quotation of arity 1 ]

The numbers above stand for which element on the stack those numbers will refer to inside the quotation. foo, bar and quux are other stack objects that aren't used by the quotation.

The second indexing I wrote was the one suggested on #concatenative, where TOS is indexed as 0 and the second object is 1, etc. I find that indexing hard to think with.

So the goal is to write a word like [ that allows the example line above to work. We'll create our own [ for each type of indexing: [n> for natural indexing (TOS = n, n = arity), and [0> for 0-based indexing (TOS = 0). My explanation will focus on [n> only, but it applies equally to [0> (the difference is factored out).

Basically, the effect we want to achieve is to turn [n> 2 1 ] into [| 1 2 | 2 1 ] .
Not only is this an accurate explanation of the effect that the [n> I wrote achieves (though it's not what it does), but that code is actually runnable:

2 9 [| 1 2 | 2 1 ] call 2array . ! => { 9 2 }

This is handy for me since I need to explain why this works before I explain [n> . It also means that we'll be looking at the definition of [| for inspiration (code thievery).

[| is a parser word - it takes control of the lexer. Parser words are defined with the SYNTAX: word and they can choose at what level they want to read ahead - the lowest level being one string and the highest being a sequence of parsed objects. In-between there's a way to read a sequence of strings, possibly modify them, and only then parse them. That's what [| does, and those strings are then called "raw tokens" because they haven't yet been parsed. As a way to make this clearer, this is how each of these 3 levels would see some piece of code:

! reading 2 dup "blah"

"2 dup \"blah\""
Lowest level - I used this to write HERELINE:
This is so low that the only built-in ways to read ahead are by char (column), by line or the whole source code. There's no nice way to say "oh, only read until you see a matching ]" - you'd have to read the whole source and balance the brackets yourself. Also, after you're done don't forget to move the lexer head to where you read the last char. So yes, this is as low as it gets.

{ "2" "dup" "\"blah\"" } - sequence of strings (raw tokens)
Intermediary level - [| uses this and we have to as well because we need the names of the local variables to setup some things so that when they're parsed, they won't throw an error saying those words can't be found.
example:
SYNTAX: %[ "]" parse-tokens suffix! ;
%[ 2 dup "blah" dup ] ! => { "2" "dup" "\"" "blah\"" }
(char 34 (the " character) is a special word (it can touch the next), that's why the above looks weird)

{ 2 dup "blah" }
Highest level - everything has been parsed and if a word doesn't exist, an error will be thrown.
example:
SYNTAX: %[ \ ] parse-until suffix! ;
%[ 2 dup "blah" ] ! => V{ 2 dup "blah" }
%[ 2 dupe "blah" ] ! => error: "dupe" is not a defined word

Again, we'll be using the middle level, which is what [| uses.

This is the [| does, in recipe form, on the line "[| 1 2 | 2 1 ]":

1 - read tokens until "|" is found => { "1" "2" }

2 - create a word for each token, whose role is to fetch the respective position on the retain stack (this is a "local")

3 - ah, but those words are created in a new namespace

4 - now take this namespace and make it be the first in line for when we lookup words

5 - now go ahead and PARSE until \ ] (you see the trick? now it can parse instead of tokenize since it read "1" and "2" and turned them into the words \ 1 and \ 2 which can now be found as words instead of numbers, since the namespace we created with the definition for \ 1 and \ 2 shadow even the fact that they're numbers (!). This is the time I realize I should be writing these examples with "a" and "b" so that it would be clearer now why it can't throw a "no word found" for \ a - the reason being that we would have just created an \ a word from "a" in a new namespace and have placed that new namespace as the first one to be looked up when parsing. But if I had used "a" and "b" I'd have no reason to be explaining all this. So there.)

6 - we're done! (handwavingly so)

Ok, now you understand how [| works. So why didn't I write my [n> word to simply output the equivalent [| code? Because I have no clue how to write a SYNTAX: word that builds code from other SYNTAX: words. So I took what I believe (hope?) was the long way around, but I learned some useful stuff on the way so it was a good thing.

To be continued...

Thursday, December 1, 2011

Indexed stack shufflers

Some 12 hours ago a guest on #concatenative shared this idea for a new syntax for stack shuffling:
"world" ", " "hello" "!" <1 2 3 0> 4append ! => "hello, world!"

I decided that would be a great challenge and an opportunity to level up my macro skills. Here it is:
USING: fry grouping kernel lexer locals locals.parser
locals.types make math math.order math.parser multiline
namespaces parser quotations sequences vocabs.parser words ;
IN: shortlocals

! 4 -> { "arg-3" "arg-2" "arg-1" "arg-0" }
: >params-0 ( n -- vec )
1 + iota reverse [ number>string "arg-" prepend ] map ;

! 3 -> { "arg-1" "arg-2" "arg-3" }
: >params-n ( n -- vec )
1 + iota unclip drop [ number>string "arg-" prepend ] map ;

! like \ supremum but for seqs with non-numbers intermixed
: max-num ( seq -- n )
[ number? ] filter supremum ;

SYNTAX: [0>
[let 0 :> largest!
"]"
[
! = modded parse-datum
dup string>number [ dup string>number largest max largest! "arg-" prepend ] [ ] if
] map-tokens
largest
] >params-0 ! -- words params
'[ [ _ [ make-local ] map ] H{ } make-assoc ] dip
'[ _ [ dup search [ ] [ dup string>number [ ] [ no-word ] ?if ] ?if ] map >quotation ]
((parse-lambda))
<lambda> ?rewrite-closures append! ;
! final version of 0[

SYNTAX: [n>
[let 0 :> largest!
"]"
[
! = modded parse-datum
dup string>number [ dup string>number largest max largest! "arg-" prepend ] [ ] if
] map-tokens
largest
] >params-n ! -- words params
'[ [ _ [ make-local ] map ] H{ } make-assoc ] dip
'[ _ [ dup search [ ] [ dup string>number [ ] [ no-word ] ?if ] ?if ] map >quotation ]
((parse-lambda))
<lambda> ?rewrite-closures append! ;



! top-of-stack is 0, below that is 1, and so forth
"world" ", " "hello" "!" [0> 1 2 3 0 3append append ] call . ! => "hello, world!"

! natural order: "world" is 1, the comma is 2, "hello" is 3 and "!" is 4:
"world" ", " "hello" "!" [n> 3 2 1 4 3append append ] call . ! => "hello, world!"

It's a bit of a mess still as I haven't refactored it yet, and as there's some code that I copied from other words' definitions instead of calling the word (because I was changing some bits but ended up mostly leaving the way it was and not undoing the mess).
I'll explain the code and what I learned from the experience tomorrow.

Saturday, November 26, 2011

One-line here-documents

Many of the popular languages nowadays offer here-documents as a solution to making a string out of an unescaped text. This works great for large chunks of text but it becomes too cluttered and verbose when one wants a one-line string; in all languages I know that offer here-documents, one would have to use at least two (three?) lines to write one line of unescaped text.

For example, here's a one-line heredoc in Ruby:

my_string = <<HERE.strip
This "," that
HERE

# my_string => "This \",\" that\n"

The .strip right after <<HERE is necessary to avoid a nagging trailing newline that ruby adds to every heredoc.

So as you can see, it takes three lines - one to start the heredoc, one for the text, one to end the heredoc.

I have often wanted to have a "here-line" function to be able to do just that. As far as I know that is something completely impossible to write in any of those popular languages we're talking about - they don't give you control over the lexer.

Factor does:

USING: lexer multiline.private namespaces sequences ;

: rest-of-line ( -- seq )
lexer get [ line-text>> ] [ column>> ] bi tail "" like ;

SYNTAX: HERELINE:
lexer get skip-blank
rest-of-line suffix!
lexer get next-line ;

! sample usage ( can't be used in the same file as the SYNTAX: line above
! -- if using scratchpad, copy&paste above first, then this separately )
HERELINE: This "," that
dup . print

"This \",\" that"
This "," that

Tuesday, November 15, 2011

Emacs keybindings

Emacs keybindings

I'm an Emacs guy - I finally converted after years of resisting it.
I can't function if my keybindings aren't available to me.
C-a, C-e and C-k are especially important, and their use seem to be somewhat common in the unix world.

The Factor listener, however, does not implement those.
In fact, out of the box it uses C-a for selecting all text (C-x h in emacs), C-e for Factor's "edit" command (writing + and pressing C-e in stock Factor is equivalent to writing \ + edit and pressing <RET>), and C-k for killing an entire line (emacs' C-a C-k) or the whole text (emacs' C-x h C-w), I can't remember which.

I had made a patch for fixing this severe oversight, and mrjbq7 was kind enough to apply a modified version of it. Life was good. But then Slava reverted the patch, and I went back to a dysfunctional listener. The reason the patch was reverted was because it does not provide a real solution (which would be to make keybindings customizable), and it breaks muscle memory for those who were used to Factor's keybindings and that are not emacs users. I think that's fair - I still remember how it felt to not be an emacs user - everything seemed so emacs-centric!
Now that I'm a convert nothing ever feels emacs-centric enough :).

Not to worry, though, because qx opened my eyes to the fact that I could override the stock keybinding settings using my ~/.factor-rc file.

So I thought I'd post my ~/.factor-rc file whole in hopes that it will help a fellow emacs user (or a regular unix command line user).

The only thing I'm missing is better word navigation using C-← and C-→ (currently it stops on every boundary and does not behave like emacs' M-f and M-b). I'll see if I can fix that later.

Here it is, my unadultered ~/.factor-rc file:

USING: definitions editors.emacs io.pathnames multiline
namespaces parser ui.commands ui.gadgets.editors ui.gestures
ui.operations ui.tools.listener vocabs.hierarchy vocabs.refresh
;
QUALIFIED: editors

! ------------------------------------------------
! to get a trimmed USING: vocabs list, do this:
! - start factor normally
! - edit this file and comment all lines above this comment
! - comment in the auto-use line below
! - on the open Factor listener, run "~/.factor-rc" run-file
! - you can't really type ~, so please pre-expand it
! - resolve ambiguities if any restarts are thrown
! - copy and paste the suggested USING: here
! - comment out the auto-use line
! - go to #concatenative and thank qx for his patience
! auto-use ---------------------------------------

! setup emacs to work from Factor
"/usr/local/bin/emacsclient" \ emacsclient-path set-global

! quick troubleshooting - if the above does not work:
! make sure this has been done once:
! - sudo ln -s /Applications/Emacs.app/Contents/MacOS/bin/emacsclient /usr/local/bin/emacsclient
! make sure this is included in ~/.emacs.d/init.el:
! - (server-start)

! ----------
! setup the following keybindings as emacs has them:
! C-a, C-e, C-k

! basis/ui/gadgets/editors/editors.factor
! this adds C-k, C-a, C-e
editor "caret-motion" f {
{ T{ button-down } position-caret }
{ T{ key-down f f "LEFT" } previous-character }
{ T{ key-down f f "RIGHT" } next-character }
{ T{ key-down f { C+ } "LEFT" } previous-word }
{ T{ key-down f { C+ } "RIGHT" } next-word }
{ T{ key-down f f "HOME" } start-of-line }
{ T{ key-down f f "END" } end-of-line }
{ T{ key-down f { C+ } "HOME" } start-of-document }
{ T{ key-down f { C+ } "END" } end-of-document }
{ T{ key-down f { C+ } "k" } delete-to-end-of-line } ! +
{ T{ key-down f { C+ } "a" } start-of-line } ! +
{ T{ key-down f { C+ } "e" } end-of-line } ! +
} define-command-map

! basis/ui/tools/listener/listener.factor
! this substracts a flawed C-k
interactor "interactor" f {
{ T{ key-down f f "RET" } evaluate-input }
! { T{ key-down f { C+ } "k" } clear-editor } ! -
} define-command-map

! basis/ui/tools/operations/operations.factor
! these replace C-e with C-b to invoke the "edit" command
[ pathname? ] \ editors:edit-file H{
! { +keyboard+ T{ key-down f { C+ } "e" } } ! -
{ +keyboard+ T{ key-down f { C+ } "b" } } ! +
{ +primary+ t }
{ +secondary+ t }
{ +listener+ t }
} define-operation
! ---
[ definition? ] \ editors:edit H{
! { +keyboard+ T{ key-down f { C+ } "e" } } ! -
{ +keyboard+ T{ key-down f { C+ } "b" } } ! +
{ +listener+ t }
} define-operation

! end keybindings
! ----------

! uncomment and run Factor on the command line to sort USING: vocabs
! << manifest get pprint-manifest >>

A simpler continuation example

Here's a simpler example using call/cc that needs function-continuation isomorphism to work, and how to handle that need in Factor.
(let ((x (call/cc (lambda (k) k))))
(x (lambda (ignore) "hi")))

In the code above, x is a continuation the first time around, and then a function the second time around (read below for a walkthrough), and it gets called both times with the same syntax. Factor doesn't implement call for continuations, so we need to "fix" it:
QUALIFIED: kernel
USE: locals

: call ( quot/cont -- ) dup continuation?
[ continue-with ] [ kernel:call ] if ; inline

FROM: scratchpad => call ;

[let [ ] callcc1 :> x
[| ignore | "hi" ] x call ]

On the first line, x is bound to the continuation about to bind something to x.
The second line then calls x with a lambda, which causes x to continue, which bounds the lambda to x. Now we've just run the first line a second time.
When we're back at the second line again, x is the lambda, and now it gets passed itself as an argument, which causes it to simply return the string "hi".

Continuations have a reputation of being useless, but I believe this is because they are a bit complicated to code with and that they're rarely implemented as they should be (with a global CPS transform), thus making them slow in most languages that implement them as an afterthought.

Monday, November 14, 2011

From Lisp to Factor

Finally, the search is over.

I've been looking for the perfect programming language for years, and Factor is the one with the most potential to become that.

Before I found Factor I was into Lisp and Scheme, and I was in love with the parentheses. But that's just because I did not know Forth - if I had known it at the time I'd be working on a language that is a cross between Lisp and Forth.

Well, Factor is that language.

For my first post I will translate the good old yin yang trick from Scheme. I was not surprised to see that this works in Factor at all, since it seems that Factor actually copies "the stack and environment", much like the way schemers pretend call/cc works when explaining it to people. [1]

What yin yang does is it prints this ad infinitum:
@*@**@***@****@*****@****** ...

Here's the Scheme code:

(let* ((yin
((lambda (cc) (display #\@) cc)
(call-with-current-continuation (lambda (c) c))))
(yang
((lambda (cc) (display #\*) cc)
(call-with-current-continuation (lambda (c) c)))))
(yin yang))

And here's the Factor version with a safety trigger to prevent it from looping forever:

USE: locals ;

[let
0 :> counter!
[ ] callcc1 [| cc | "@" write counter
100 >
[ "force-stopped" throw ]
[ counter 1 + counter! ] if
cc ] call :> yin
[ ] callcc1 [| cc | "*" write cc ] call :> yang
yang yin continue-with ]


It will print something like this (without the newlines):

@*@**@***@****@*****@******@*******@********@
*********@**********@***********@************
@*************@**************@***************
@****************@*****************@*********
...

If you understand continuations you can read up on the docs for callcc1 and continue-with and immediately understand how this works exactly like the Scheme version (or how it can be made to do so if the safety trigger is removed).

[ ] callcc1 is the literal equivalent to (call/cc id) that you see above in its more verbose version.

continue-with is like Scheme's apply but only for continuations; it takes a continuation to be called, and one argument to be passed to it. Continuations and lambdas in Factor are not isomorphic, so we need a way to call continuations. arg k continue-with is equivalent to (k arg).

I'm using locals above to make it simpler to see, but the snippet would be shorter if I did away with the cc locals:

USE: locals ;

[let
0 :> counter!
[ ] callcc1 [ "@" write counter
100 >
[ "force-stopped" throw ]
[ counter 1 + counter! ] if
] call :> yin
[ ] callcc1 [ "*" write ] call :> yang
yang yin continue-with ]


[1] I believe only a few schemes implement call/cc with any copying at all. It is possible to get call/cc for free, i.e. its cost being the same as of a normal function call, and it is also possible that this way of implementing call/cc is actually easier than alternatives.