Wednesday, February 11, 2009

Smallpox Salad?

I played with Microsoft's Photosynth for a couple of minutes today, and in looking for photos for my synth, I found a cute photo I took in Dubrovnik, Croatia. I simply had to post it. That last salad choice is killer.

As for more tech stuff, and the new MamboPics.com, stay tuned... coming in 2009.

Friday, May 23, 2008

A few more little details about pyoffwiki

Old Ostrich II by ldbaker
I didn't realize how many offline Wikipedia readers are out there in the wild until I was almost done with a working version of pyoffwiki. Since I was so close, I decided to take the (usually bad) ostrich strategy of not looking at anything else until I was done. This is good because I actually got this thing released, and people are using it... with all its issues. But bad, because I knew that people had already thought about (and solved) the same problems I was having, and came up with good solutions. Overall, since it was only a day or two, I think it was justified, and now I can go back and revisit the problems.

The main problem is indexing. It can't be a huge file like the one generated with Xapian in Thanassis Tsiodras' solution. It also shouldn't be like my solution, which allows only exact search (using cdb) or browsing (sorted title list) starting from a given string. It should be something resembling Patrick's solution. Or even better... a compressed suffix tree, that is, if it's small and fast enough. Time for some more research. Any other ideas?
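To make the limitation concrete, here is a minimal sketch of the two lookups described above. It is not the pyoffwiki code: a plain dict stands in for the cdb file, the titles and byte offsets are made-up sample data, and the function names are hypothetical.

```python
import bisect

# Hypothetical sample data: article title -> byte offset in the dump.
# In the real thing this role is played by a cdb file, not a dict.
offsets = {
    "Malaria": 1200,
    "Python": 5300,
    "Python (programming language)": 5900,
    "Snake": 7100,
}

# The sorted title list used for browsing.
titles = sorted(offsets)


def exact_lookup(index, title):
    """Exact search: returns the offset only if the title matches exactly."""
    return index.get(title)


def browse(sorted_titles, start_string, limit=10):
    """Browsing: list up to `limit` titles beginning at the first title
    that sorts at or after `start_string` (binary search via bisect)."""
    start = bisect.bisect_left(sorted_titles, start_string)
    return sorted_titles[start:start + limit]
```

With this layout, `exact_lookup(offsets, "Pyth")` finds nothing, while `browse(titles, "Pyth", 2)` lands on the "Python" entries. Neither one can find a title by a word in the middle of it, which is the gap a suffix-based index would fill.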

There is one thing, though, that sets this project apart from the others... which has been a design goal for me from the beginning. I wanted to minimize what people have to do on their own machines and to minimize handling of the Wiki database dump data. The first one gets rid of the bad experience of spending 8 hours building an index and finding you did it wrong. The latter makes Wikipedia the unique source of the data... I don't want to distribute 4GB files, torrent or otherwise... let Wikipedia cover the bandwidth costs. It also lets the user feel safer... that the data is actually coming from Wikipedia, and not some random person. (Some people are neurotic like that.) I know... the content of Wikipedia is written by... blah blah blah. I still trust it more than many other sources. Any comments on this?

And lastly, to clarify some questions people have had. The full size English Wikipedia is working on the Irex Iliad, but you have to put everything on an ext2 Linux partition (for now). The German Wikipedia and English Wiktionary are also working. Pyoffwiki does NOT support any images right now... they are just way too large (about 400GB as of October 2007). Maybe we should download and resize (down) the images for the top (10,000) articles? And it does work on Linux, but I don't see an audience for it... there are other... better offline viewers that don't make the compromises that need to be made on slow and memory limited (RAM and disk) devices.

BTW: Ostriches hiding their head in the ground is a myth. Don't believe me? Try to find a good photo... not this one

Wednesday, May 21, 2008

Offline Wikipedia viewer for the Irex Iliad

One of the things that I really wanted to have with me while traveling around the world was Wikipedia. Although internet cafes are plentiful, sometimes you're just sitting in the middle of a jungle somewhere and discussing something like "Malaria", "Snake bites", or simply "Python"... I mean the other "Python". And some random question comes up, like "Was the snake that just bit me poisonous?" OK... so maybe we should have taken an experienced guide with us, but that's all beside the point. Now you don't have to fret anymore... you can always have your Wikipedia with you.

If you have an Irex Iliad and a 4GB Flash Card, you can carry Wikipedia wherever you go.

Yesterday I released a (hacked together) version of an Offline Wikipedia reader. Most of the work was either cross-compiling Python for the Iliad, finding parsers for Wikipedia's markup, or figuring out how to do fseek() in a huge (3.5GB) bzip2 compressed file. I also had to learn how to program in Python, and write some indexing/browsing code (based on CDB), but that's just details.
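For anyone wondering what "seeking" in a compressed file can even mean, here is a minimal sketch of one way to get random access. It is an assumption for illustration, not the method pyoffwiki uses: it compresses each chunk as its own independent bzip2 stream and remembers where each stream starts, whereas the real Wikipedia dump is one big file where the block boundaries have to be found first. The function names and sample chunks are hypothetical.

```python
import bz2
import io


def build_chunked_archive(chunks):
    """Compress each chunk as a separate bz2 stream, back to back.
    Returns the archive bytes and the byte offset where each stream starts."""
    out = io.BytesIO()
    offsets = []
    for chunk in chunks:
        offsets.append(out.tell())
        out.write(bz2.compress(chunk))
    return out.getvalue(), offsets


def read_chunk(fileobj, offsets, i):
    """Seek straight to stream i and decompress only that stream."""
    fileobj.seek(offsets[i])
    if i + 1 < len(offsets):
        raw = fileobj.read(offsets[i + 1] - offsets[i])
    else:
        raw = fileobj.read()  # last stream runs to end of file
    return bz2.decompress(raw)


# Made-up sample articles standing in for slices of the dump.
archive, stream_offsets = build_chunked_archive(
    [b"Malaria article", b"Python article", b"Snake article"]
)
```

Reading the second article is then `read_chunk(io.BytesIO(archive), stream_offsets, 1)`, and nothing before it gets decompressed. The cost is a slightly bigger file, since every stream carries its own header and can't share compression context with its neighbors.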

I guess some people thought it was useful, since I was already on the front page of MobileRead.com (archived here), but now I wonder if this will make a dent in Amazon's Kindle sales? (jk)

I'll be covering more about this project on this blog, and over on the pyoffwiki project page on Google Code. If you have an Iliad and want to use it, head over to the MobileRead forum, which also contains an installer for the Iliad (thank you Adam B), or directly to the getting started page.

Tuesday, May 20, 2008

Python 2.5 for the Irex Iliad

So someone posted a semi-functioning Python build for the Iliad a while back, but most of the modules (even the help) don't work. Since I wanted to learn Python and needed it for a small project on the Iliad, I decided to take a crack at cross-compiling it.

I have to say that it is one of the more complicated things to build (especially when you know nothing about the language), since it uses the Python interpreter itself in the build process. And since we are cross-compiling, we can't run the Iliad binary on the build machine. Jumping through all kinds of hoops, I managed to get it up and running, and even included a couple of extra modules (such as Django) in the mix.

To download it, head over to the Irex forum on MobileRead.

Monday, May 12, 2008

Back in the city and onto less interesting stuff

No, I'm not talking about the new Sex and the City movie... I'm talking about myself. Since I gave up my old place in the West Village before traveling, I had to find a new place to live. OK, so it's not the West Village, but the Upper West Side is not that bad. At least I'm close to Fairway... then again... I had Chelsea Market before.

Also, now that I'm back, I'll be using this blog for what most of you will find "less interesting" stuff... the stuff I do for "a living". Tech stuff.

But don't fret... I'll probably put more stuff (probably mostly photos) from my trip on my original blog website: AmirShimoni.com. I just don't know when that will happen.

Monday, April 14, 2008

Basic Travel Route (The Movie)

A couple of people have asked to see what my trip route was... so I mapped a basic outline of my route on Google Earth and created a movie from it. It's definitely not complete, but a good representation of the route. You can also browse the route on Google Maps.

Wednesday, April 2, 2008

I'm home... exactly 2 dozen countries later

So it's been almost 8 months since I left... and I've visited 24 countries... yet... it feels like I never left NY.

Albania Austria Bosnia Bulgaria Croatia Czech Rep.
Estonia Germany Hungary India Israel Latvia
Lithuania Macedonia Moldova Montenegro Poland Romania
Russia Serbia Slovakia Slovenia Turkey Ukraine