Friday, April 17, 2009

Precise Algorithms

I think that algorithms described in a paper should have accompanying source code.

I was giving a presentation that basically summarized "Laplacian Eigenmaps for Dimensionality Reduction and Data Representation Methods" by Belkin and Niyogi.

I had this bullet point:

  • Nodes i and j are connected by an edge if i is among n nearest
    neighbors of j or j is among n nearest neighbors of i

I made comment about how this was not a symmetric property. A person asked a question about why it was not symmetric.

My answer was basically, its symmetric but depending on how you parse the sentence it can be un-symmetric. One way the OR is part of the algorithm, the other way, the OR is an or in the implementation of the algorithm.

One way is to say:

(defun edgep (i j)
(or (n-nearest-neighbors i j *n*)
(n-nearest-neighbors j i *n*)))

Another way:
define edgep like this,

(defun edgep (i j)
(n-nearest-neighbors i j *n*))

or define edgep like this,

(defun edgep (i j)
(n-nearest-neighbors j i *n*))

So, yeah, if you are describing an algorithm, and there is more than one way to do a step, be clear whether you mean that you can do it two ways, or that they have to OR the results of two sub-steps. I mean, it wasn't that ambiguous. But it shouldn't be able to have multiple interpretations, even if one seems like a stretch, cuz some unsuspecting grad student will find it.


Spatchcock said...

My parse:

Nodes i and j are connected by an edge if:
(i is among n nearest neighbors of j)
(j is among n nearest neighbors of i).

Not being a lisp expert, I can't say which of these three possible pseudo-implementations I'd choose, although I feel safest with the first. Moreover, I think one would have to call it symmetric.

Can you put parentheses into the original sentence to resolve the possible ambiguities? Do these correspond to the usual rules of operator precedence?

Advanced Compiler Design & Implementation spends an early chapter defining a pseudocode used throughout the book to specify algorithms. It's quite heavily featured [includes currying, returning of functions], and every algorithm is presented this way [perhaps only this way].

I liked that.

sstc said...

hopefully you don't need any lisp experience to get the drift of what I was saying.

I like the idea of parentheses in the original sentence. I wish that I could use them more when I have long sentences with many ors and ands. (when the stuff being AND-ed is made up of OR fragments, or something...)

I was going to specifically ask about the Knuth book, but he made up his own assembly language, correct? for the book to use?

Kudos to ACD&I. I just wish journal papers were more rigorous sometimes.

Spatchcock said...

One point to add: I found De Micheli's Synthesis and Optimization of Digital Circuits next to impossible to precisely comprehend. Completion of the homeworks required hours of deliberation between several of us and frequently required consulting the original papers from which the algorithm was obtained.

Nothing reveals ambiguity like struggling to learn and perform an algorithm the night before said performance is due.

That said, given time and resources, the best way to learn something like that is to implement it.

Spatchcock said...

Knuth defined his own assembly language to demonstrate how these algorithms map to machines. He also did this to span a three+ volume set so the investment was worth it. Then, he revi$ed the in$truction $et to be more RI$C-like and re-relea$ed TAOCP.

You can't exactly define your own language in every journal paper. You can however borrow/cite someone else's [perhaps 'Informal Compiler Algorithm Notation' used in ACD&I].

Or just use Python.

sstc said...

Python works... but what if your lambda function needs to be more than an expression? (doesn't happen all the time, but it does happen)

Python would work for most things, its quick to pick up and just get some code banged out to do something useful without crying in frustration about things that don't matter to the code you are writing.