Creating an HTML file with white space traces
Posted by Tariq • Tuesday, December 9. 2008 • Category: One liners
Once upon a time I was comparing some Java files. One way to find plagiarism among students who share program code is to look at the white space traces in a file. Students typically try to disguise copied source code by changing variable names and function names; however, they usually fail to conceal the original author's white space patterns. White space consists of characters that don't appear on screen, such as spaces, tabs, and newline characters. Anyway, you can catch lots of cheaters by looking at the white space patterns in files ... maybe that's another blog entry, so back to this one.
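As a quick aside, here is a rough sketch of what a white space trace could look like on the command line. It is only an illustration of the idea, not the method I actually used back then: ws_trace is a made-up helper name, and the S and T markers are an arbitrary choice.

```shell
#!/bin/sh
# ws_trace is a hypothetical helper (not part of the original one-liner).
# It deletes every character except space, tab and newline, then maps
# space -> S and tab -> T so the remaining pattern is visible.
ws_trace() {
  tr -cd ' \t\n' < "$1" | tr ' \t' 'ST'
}

# Usage sketch (the file names are placeholders):
#   ws_trace Alice.java > alice.trace
#   ws_trace Bob.java   > bob.trace
#   diff alice.trace bob.trace && echo "identical white space traces"
```

Two files that differ only in identifiers end up with the same trace, which is why renaming variables doesn't hide much.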
Today I am making an HTML file for each code file (Java in this example) in a directory and highlighting the white space in that file in pretty colours. We will highlight spaces, tabs, and newlines. Here is a one liner to do that:
for file in *.java; do echo "<html><head><title>$file Source</title><style>body { font-family: courier; background: #EEE; } .tab { background: #F00; } .space { background: #0F0; } .eol { background: #00F; }</style></head><body>" > "$file.html"; sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e 's/ /<span class="space">\&nbsp;<\/span>/g' -e 's/\t/<span class="tab">\&nbsp;\&nbsp;\&nbsp;<\/span>/g' -e 's/$/<span class="eol">\&nbsp;<\/span><br \/>/' "$file" >> "$file.html"; echo "</body></html>" >> "$file.html"; done
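If the one liner is hard to read, the same idea can be unrolled into a small function. This is just a sketch under a few assumptions: ws_to_html is a name I made up, the function writes to standard output instead of a file, and it escapes & before < and > so that ampersands in the source survive. It builds a literal tab with printf so the tab pattern does not depend on sed understanding \t.

```shell
#!/bin/sh
# ws_to_html is a hypothetical name. It prints an HTML page for one source
# file with spaces, tabs and line ends wrapped in coloured spans.
ws_to_html() {
  src=$1
  tab=$(printf '\t')   # literal tab character for the sed pattern below

  # Page header with the same colours as the one liner.
  cat <<EOF
<html><head><title>$src Source</title>
<style>
body { font-family: courier; background: #EEE; }
.tab { background: #F00; }
.space { background: #0F0; }
.eol { background: #00F; }
</style></head><body>
EOF

  # Order matters: escape HTML specials first, then wrap spaces (before any
  # span tags containing spaces are added), then tabs, then line ends.
  sed -e 's/&/\&amp;/g' \
      -e 's/</\&lt;/g' \
      -e 's/>/\&gt;/g' \
      -e 's/ /<span class="space">\&nbsp;<\/span>/g' \
      -e "s/$tab/<span class=\"tab\">\\&nbsp;\\&nbsp;\\&nbsp;<\\/span>/g" \
      -e 's/$/<span class="eol">\&nbsp;<\/span><br \/>/' \
      "$src"

  echo '</body></html>'
}

# Usage sketch, one HTML file per Java file in the current directory:
#   for file in *.java; do ws_to_html "$file" > "$file.html"; done
```

Spaces have to be wrapped before the tab and end-of-line spans are added, otherwise the space inside each span tag would get highlighted too.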
Cool! Results are below and, as you can see, white space isn't as uniform as you might expect.