Thursday, November 22, 2012

The hunt for a generic sort

WikiCFP is quite useful for collecting together information about conferences you may be interested in submitting to. It even has a nice google calendar view, where you can see the deadlines of your selected conferences coming up. However, you may need to have additional information about each conference, so it's conceivable that a small database may be a good idea here.

We will use Prolog as our database engine, not only because we absolutely love it, but because XSB offers a very generic sorting routine, called parsort/4:


parsort(+L1, +SortSpec, +ElimDupl, ?L2)
  - L1 is the list we want to sort
  - SortSpec is a sorting specification; can be asc(I) or dsc(I), where I is an argument number. So asc(1) means sort in ascending order with respect to the first argument of the terms in List, desc(3) means sort in descending order with respect to the third argument of the terms in List e.t.c.
  - if ElimDupl is non-zero, then the duplicate elimination is performed before returning the output, and if it's zero duplicate elimination is not performed
  - L2 is the variable that holds the final result.

Okay, so let's start with defining our facts. We will use a separate source file that will contain information about the conferences that we are interested in, stored as Prolog facts of the form

conf(ShortName,FullName,Dates,Deadline,Website).

The first thing we need is a prediate that will take as input a list of conf/5 facts and print out their contents in a pretty way:


print_list(PD,PW,[pconf(Name,Descr,When,Deadline,Web)|Confs]) :-
        write(Name),
        atom_length(Name,NLen), T1 is 10 - NLen, tab(T1),
        ( PD == 1
        -> write(Descr), atom_length(Descr,DLen), T2 is 80 - DLen, tab(T2)
        ; true
        ),
        write(When), atom_length(When,WLen), T3 is 20 - WLen, tab(T3),
        write_deadline(Deadline), tab(2),
        ( PW == 1
        -> write(Web), atom_length(Web,WeLen), T5 is 20 - WeLen, tab(T5)
        ; true
        ),
        nl,
        print_list(PD,PW,Confs),
        fail.

I call the technique used here "print-by-failure"; the call to "fail" in the end of the definition of print_list/3 ensures that things will continue to get printed out until the list is empty. Deadlines are represented by terms of the form deadline(Month,Day,Year). We want to print out information just for the conferences for which the deadline hasn't passed yet, so the following predicate is also useful:



before(today(M,D1,Y),deadline(M,D,Y)) :-
        D1 < D, !.
before(today(M1,_D1,Y),deadline(M,_D,Y)) :-
        M1 < M, !.
before(today(_M1,_D1,Y1),deadline(_M,_D,Y)) :-
        Y1 < Y.

And last but not least, the predicate that will generate a list containing the information needed from the conf/5 facts, print out a nice table and then call print_list/3 to do the rest:



show_confs(PD,PW) :-
        findall(pconf(Name,Descr,When,deadline(M,D,Y),Web),
                ( conf(Name,Descr,When,deadline(M,D,Y),Web), datime(datime(Y1,M1,D1,_,_,_)), 
                  before(today(M1,D1,Y1),deadline(M,D,Y)) ),
                List),
        write('Name'), tab(6), %10
        ( PD == 1
        -> write('Description'), tab(69) % 80
        ; true
        ),
        write('Date'), tab(16), % 20
        write('Deadline'), tab(8), % 16
        ( PW == 1
        -> write('Web'), tab(17) % 20
        ; true
        ),nl,
        writeln('------------------------------------------------------------------------------------------------------------------------------------------------------'),
        parsort(List,[asc(4)],1,SList),
        print_list(PD,PW,SList).

The nice thing with parsort/4 is that we can change the way our output will look like without any change to the sorting routine at all! The only thing we need is a different specification of the sorting specifications. Here, for example, I want to sort in ascending order with respect to the deadline of each conference, which is pretty handy. Try to do all this with 50 lines of C code... :)

Drawing Prolog-generated graphs with graphviz

While I was TAing for a "Computing with Logic" course, one of the assignments had to do with graphs and computing the transitive closure over them with various ways. There was a question in the assignment asking students to identify the form of each graph described. Most of them did a very good job in either using simple text to "draw" the graph, or either use one of the many available software that helps with this kind of stuff. However, drawing the graph yourself means that you actually understand the form. What if you have some really complex graph specification in Prolog; how can you draw it without understanding how it looks like?

Well, the answer is pretty much straightforward; just as when you use writeln's to print out helpful information while the graph is being generated, why don't just output that exact information in a more "concrete" manner which can be parsed by graphviz, and let that draw the graph for you?

And so I set programming this small idea. It ended up not being to painful to do; it was a good exercise, and the result is quite pretty :)

First, here is the specification of three distinct graphs: trees, cycles and something called a "dline":


size(cycle,8).
size(tree,4).
size(dline,12).


cycle(M,0) :- size(cycle,M). % size(M) 
cycle(I,I1) :- 
          size(cycle,M), % size(M) 
          0 =< I, I < M, I1 is I + 1. 

tree(I,I1) :- size(tree,M), % size(M)
          0 =< I, I < M, I2 is I * 2, I1 is I2 + 1. 
tree(I,I1) :- size(tree,M), % size(M) 
          0 =< I, I < M, I2 is I * 2, I1 is I2 + 2. 

dline(I,I1) :- size(dline,M), % size(M) 
          0 =< I, I < M, 0 is I mod 4, I1 is I + 1. 
dline(I,I1) :- size(dline,M), % size(M) 
          0 =< I, I < M, 0 is I mod 4, I1 is I + 2. 
dline(I1,I2) :- size(dline,M), % size(M) 
          I is I1 - 1, 0 =< I, I < M, 0 is I mod 4, I2 is I + 4. 
dline(I1,I2) :- size(dline,M), % size(M) 
          I is I1 - 2, 0 =< I, I < M, 0 is I mod 4, I2 is I + 4.

All right, now that we have our graph specifications, let's see how can we generate a valid graphviz source code and let it draw our graphs. The first thing we need to do is transform a list of generated edges into valid graphviz code:

draw_graph(_Dest,_Shape,[]).
draw_graph(Dest,Name,[Shape|Edges]) :-
        functor(Shape,Name,2),
        arg(1,Shape,X),
        arg(2,Shape,Y),
        write(Dest,'"'),write(Dest,X),write(Dest,'"'),
        write(Dest,' -> '),
        write(Dest,'"'),write(Dest,Y),write(Dest,'"'),writeln(Dest,''),
        draw_graph(Dest,Name,Edges).

The variable Dest holds the IOhandler for the file we want to write our graphviz code in. Name is the name of the graph (cycle/tree/dline) and the list contains elements of the form [name(1,2),name(2,3),...], which means that for the graph "name" there is an edge from vertex 1 to vertex 2, from vertex 2 to vertex 3 and so on. So we just split each element into a functor name and its arguments using functor/3 and arg/3, and then just output everything in graphviz syntax, i.e. lines of the form "1" -> "2" e.t.c.

Next, we need a predicate that will generate the list of edges we talked about above for a given graph name, and call draw_graph/3 to output the graphviz code:

draw(Dest,Name) :-
        size(Name,M),
        get_list(M,[],List), 
        functor(Shape,Name,2),
        arg(1,Shape,X),
        findall(Shape,(member(X,List),Shape),L), writeln(L), 
        draw_graph(Dest,Name,L).


get_list(0,List,[0|List]).
get_list(M,List1,List) :- M > 0,
        M1 is M - 1,
        get_list(M1,[M|List1],List).



Again Dest holds the IOhandler for the output file, and Name is a valid graph name (cycle/tree/dline). get_list/2 just generates a list of numbers starting from M and ending at 0. We then construct a term of the form "name(From,To)" that will be bound to the facts generated by our specifications for graphs from above, thus generating in L the list of needed edges.

The final predicate we need is the one that will do all the file opening/closing, graphviz code generating, and will also run graphviz to draw each graph:



draw1(Fil) :- writeln(Fil),
        atom_concat(Fil,'.gv',File), 
        atom_concat(Fil,'.png',Out),
        open(File,write,Dest),
        writeln(Dest,'digraph G {'),
        draw(Dest,Fil), 
        writeln(Dest,'}'),
        close(Dest),
        atom_concat('dot -Tpng ',File,Cmd1),
        atom_concat(Cmd1,' > ',Cmd2),
        atom_concat(Cmd2,Out,Cmd),
        shell(Cmd).

File holds the name of the output file (which must be a valid graph name as discussed above). We open 2 separate files; the one in which we will put the graphviz code (ending in .gv), and the final PNG picture file. I used atom_concat/3 to concatenate the necessary strings to create a valid graphviz command, and just call it with shell/1. That's pretty much it! Now you can just fire XSB, load the source file containing the code we wrote here, and ask for draw1(tree), draw1(cycle) or draw1(dline). You'll then get some really pretty pictures from graphviz :)





Wednesday, June 6, 2012

Painting graphs

Some time ago, a question was asked in StackOverflow about how we can use Prolog to compute the (or a single) coloring for a graph. After some thought, I came up with the following solution, which of course I am not really sure is correct, although it seems to give the correct results on the cases I looked at.


Let's start with describing our graph; we will use edge/2 facts to represent eges. A fact edge(a,b) means that there exists an edge between vertices a and b, as usual. Similarly for the colors, we will use color/1 to represent the available colors for our...painting :) So the initial database looks like this:

edge(a,b).

edge(b,c)
edge(b,d).
edge(c,d).

color(blue).
color(red).
color(green).





And now, for the real work :) We will represent the final coloring as a list of Vertice-Color pairs, where of course there should be no more than 1 pairs with the same vertex as the first element. As a base case, we will use what intuition says: vertex a gets a color c if there is no edge going out of a, assuming that we have colored the path(s) leading to a. In Prolog, this gets translated as:


coloring([V-C]) :- 
 color(C), 
 \+ edge(V,_).



Next, the recursive case. We will append a V-C pair on the front of the coloring list, if we can paint V with a color C, there is an edge from V to another vertex V1, and coloring the rest of the graph yields a color to V1 different than C. This is a brute-force algorithm; it tries to assign all the possible colors to vertices, until it finds a particular coloring that satisfies all the constraints. Translated in Prolog:


coloring([V-C,V1-C1|Coloring]) :-
        color(C),
        edge(V,V1),
        V \== V1,
        coloring([V1-C1|Coloring]),
        C1 \== C. 



Finally, we need a way of getting all the possible colorings for a graph. That's easy, using the findall/3 predicate. It works as follows; given an answer pattern in the first argument, calls the goal given in the second argument, and returns a list with all the results in the third. Pretty cool, huh? :)


:- import length/2 from basics.


colors(X) :-
        coloring(X),
        findall(V,edge(V,_),List),
        length(List,Len),
        length(X,Len). 



We just need one more thing; to check that the length of the solution is actually as long as the number of distinct vertices of the graph (we don't want solutions that only give colors to some of the vertices, right?). And that's it! Just call 


|?- colors(X). 


from your favorite Prolog interpreter, and you'll get all the possible colorings for your graph! Feeling like Picasso already? :)

Tuesday, March 13, 2012

The art of Term Expansion

One of the features I most love about Prolog, is the built-in availability of "Term Expansion". That means that the programmer has the power to change code on-the-fly, as it's loaded into the compiler. It's basically a preprocessing step (in XSB I think it's the last preprocessing step before actually feeding the code into the compiler), and the user can enable it whenever they wish. Actually, DCG's in Prolog get handled using term expansion; remember that in DCG's, we have to add 2 more arguments to each head predicate.


Okay, now that we know what Term Expansion is, let's see how we can use it. In XSB (and I think in every other commercial Prolog compiler), there's a term_expansion/2 builtin, which describes how the first argument should be transformed into the second before it gets fed into the compiler. That means that we won't have access into the old version of our code after term expansion, and that's fine. If we need the old form of the code as well, we can always assert it dynamically :)


I usually start off my term-expansion-preprocessors with the following assertion:


:- assertz((term_expansion(X,Y) :- !, do_term_expansion(X,Y))).


That means that this rule will be asserted last (in the end of the database). So,  term_expansion/2 will automatically be called while loading the file, and control will be handed to our  do_term_expansion/2  routines. Next, let's handle the "end of file" case. Luckily, XSB has a builtin predicate  end_of_file/1  which we can write at any point of our program, and will signal...well...the end of the file :) We want our preprocessing to stop when the end of the file is reached, so voila:


do_term_expansion(end_of_file,_) :- !, (import member/2 from basics),
writeln('end of file!'), fail.



In this case, I just want to import something from the basics module, write that the file has ended, and fail. Now for the interesting part. Firstly, suppose the original program contains static facts of the form  uses(A,B) and distr(A,B). Let's just assert them in our post-processed version:


do_term_expansion((uses(A,B)),_) :- !, assert(uses(A,B)).
do_term_expansion((distr(A,B,C)),_) :- !, assert(distr(A,B,C)).



Now these will be available to the post-processed version. Finally, let's say we want to just throw away the body of each clause (not a very useful transformation huh? :) ); that's easily done as follows:

do_term_expansion((Head :- Body), (Head :- true)) :- !,


And that should work :) You can do a bunch of helpful things with Term Expansion; like DCG argument-adding (that I mentioned before). I used Term Expansion in a project for adding probabilities in Well-Founded Semantics (where we wanted to add extra arguments in each predicate - sounds familiar? :) ).



Moral of the day: Use term expansion. It's (sometimes) easier than writing regular expressions in Perl or sed!

Sunday, March 11, 2012

Hello World!

Okay, so I have been thinking for some time to start a blog, and write about what bothers me, what keeps me up at nights, what keeps me sleeping at days... :)

I guess I'll start with the title; append/3 is the first Prolog predicate I actually understood how it works. And believe me when I say "there became light".


append([],L,L).
append([X|L1],L2,[X|L3]) :-
    append(L1,L2,L3).




append/3 appends its second argument in the end of its first, and returns the result in the third argument. But not just this. Let's first say we want to concatenate two lists. So, we'll call append/3 with arguments 1 and 2 bound, and 3 unbound:

| ?- append([1,2,3],[4,5,6],L).

L = [1,2,3,4,5,6];

no

Cool, so it does what we want it to do. How about something...kinky? :) Say we want to call append/3 with arguments 1 and 2 unbound (i.e. replaced by variables), and argument 3 bound. Then, magic kicks in, and argument 3 is split into all the possible lists that when concatenated together give that as a result:




| ?- append(X,Y,[1,2,3]).

X = []
Y = [1,2,3];

X = [1]
Y = [2,3];

X = [1,2]
Y = [3];

X = [1,2,3]
Y = [];

no



That "magic" is the power of Prolog, and computer science people call it "non-determinism". It means that one Prolog query can have multiple answers. And that's just the beginning of it :)




Prolog uses assign-once variables; that means that once a variable gets a value, that value can't change throughout the execution of a program. You'd ask yourself why on earth can something like this be useful. And I'll just reply with a quote: "In Logic, variables are also assign-once; you can't change the value of a logic variable during the evaluation of a logic formula. And since Logic has been around long before Programming Languages were, you can see who messed-up the meaning of 'variables'".