Here and Now: 2009

Sunday, December 20, 2009

Avatar: the goods and the bads (IMHO and spoiler-rich)

Yesterday I saw the movie Avatar (IMAX version). I can't help compiling a little list of reflections on it (full of spoilers, so if you haven't seen the movie, maybe you don't want to read it yet :) ).

GOOD: This was the first time I ever saw a whole movie (3 hours) in 3D. Visually, it was amazing: an alien world which didn't look the same as our own and had many interesting and well-thought-out details - strange-looking animals and plants, mountains flying in the sky (hello Myst and Miyazaki), the biological civilization, all inhabitants of which are connected to each other and to their planet (Ursula K. Le Guin, Hainish Cycle, yes?..), flying dragons of different colors and styles bonded to their masters (I have a soft spot for dragons, provided that they fly well above my head ;) ), the technological wonders - skin-thin and semi-transparent tablets which seemed to be the laptops of the future, robots which would make their counterparts from Alien, Star Wars et cetera cry from misery, glimpses of the spaceship which looked like a slightly humanized version of Matrix, definitely with more money involved... yes, there WAS something to look at. THE WHOLE THREE HOURS of a very beautifully drawn world.

Which immediately brings up something BAD: the movie was a bit too long. Three hours is somewhere at the limit of focused attention (_my_ focused attention, at least).

And then another BAD, but of a different tint: the movie felt long because at some point it stopped being logical - even for a fantasy movie (and this was a fantasy movie, in the same sense as Star Wars is a fantasy movie; the technology here only plays the role of a setup, and the point of the story is not to test how some _scientific_ hypothesis can develop, but just to tell a fairy tale which took place far, far away).

I'll elaborate a bit more on that. Fantasy is a realm with very strange rules which are different from our own. But nevertheless, they are present and they should be honored. "All things have rules", according to Neil Gaiman, the author of Stardust, which was much lovelier as a book than as a movie, in my opinion. I felt that somewhere in the last hour of Avatar this principle was broken, and after that the world ceased to be real, in the sense that it stopped being predictable. To me, it happened after the main character took that mighty red dragon which nobody else could have taken, using his own dragon for this (with whom, according to the rules spelled out shortly before, he should have been bonded for life).

How come? If this was possible, why didn't any of the locals try the same trick? Didn't have the vibe? Not a single person with ambitions on the whole planet? They were not supposed to be stupid, after all.

Some other things which felt wrong:
- The pilot lady who refused to comply with the orders and flew home. How come she was not sitting together with the other three guys in the same cell? (Or even worse?) Saying "this wasn't in my contract" sounds like a bad excuse and would not save her from being court-martialled, had such a situation happened in real life.
- How come she could still start the plane after she had obviously been stripped of her pilot rank? This seems to be a basic security breach. I can't believe that somebody like the guy who had all that army under his command could overlook such a thing. He would have been dead a long time ago in this case.
- Wasn't there any winter on the whole planet? (Now OK... this is fantasy so it can happen).
- If everybody was supposed to live in harmony with each other, what did they have warriors for? Why didn't they tame the animals to use for food? (Well OK again... maybe it was just more fun to hunt them after all).
- The shamanic ceremony around The Tree of Souls looked very pathetic. I have to show this movie to my 11-year-old daughter to check her feelings, but I am afraid even she would not buy it.

Well, I don't want to finish on a bad note... As a show, it was good to watch. AND it's a real FANTASY movie made with a really BIG BUDGET (no cheating!) But the problem is, this movie didn't make me think about anything else apart from its inconsistencies. I am in awe of the hard work of the people who made it (their imagination is truly amazing), but I hope that next time somebody tries to do a similar thing, they'll pick a plot with more interesting twists.

Wednesday, October 28, 2009

Dependencies, dependencies...

Follow-up to the previous post...

People have always been creating works inspired by other works. After all, books, papers and blogs are just other modes of communication, and in everyday communication we do react to what the others say.

Before the internet era, though, things were different in the sense of how references were maintained. If you were writing a book, you would quote your sources. They were, in principle, locatable. Everything that has been printed has a number of attributes with which it can be more or less unambiguously identified: the title, the year of printing, the name of the printing house; later, the ISBN number, which is not an ideal solution, but is a fair enough solution to live with.

Therefore, if you read a book and the author quoted his or her sources, it was always theoretically possible to locate the sources and to find out what the work had been based upon. It was not always easy (suppose the quoted source existed in a single copy, or was not quoted directly, etc.), but it was doable.

Now we have the Internet, and everything is supposed to be stored there forever.

As I write this, the Internet is full of derivative work.

Most of the articles have links to other articles, which in their turn have links to yet other articles, and so on. It's the world wide web, so it's normal.

And these links are in fact as firm as gossamer threads. Maybe gossamer threads are even better. Ever seen broken links? (I bet any article which has existed on the web for more than a couple of years probably has them, either directly or indirectly).

I can't stop wondering what will become of all this in several dozen years or so. Will there be a big pile of everything, to which every article that has been around long enough will be assumed to go? If somebody in 20 years wants to trace how ideas were spreading via the internet, will it be possible to do? Will these sites still exist, will they still be pointing at each other? How to understand what somebody is writing about if his or her references are gone? Unless there is a search engine which has knowledge about the evolution of the internet - maybe it will be one more use case for Google or for some other company - temporal data mining :)

A pessimistic scenario is that, notwithstanding the Internet, blogging et cetera, everything below a certain level of importance will have the same chance of falling completely into oblivion as the derivative work of earlier generations - or even a bigger one. The letter which the 8-year-old Babylonian boy would scribble on the calfskin existed as an object, and would continue existing until purposefully destroyed or until the skin crumbled (which would take many years). The blogs which are hosted on a certain site are obliterated completely if the site closes down and nobody takes care of the content. (Theoretically, the information can still exist hidden on the hard drives, but how quickly will the old hard drives be disposed of?)

I hope we will dedicate enough work in the near future to prevent the web from losing its structure with time. Keeping the dependencies will be quite a difficult part of it... I am very curious how this problem will be solved. If it will be solved.

Blogs quoting blogs quoting blogs... Once again, what about copyright?..

As I see it now, quite a large share of the blogs around us are nothing more than a collection of links to some other blogs plus, optionally, some comments of the blog author and / or related blurb of the blog followers (which is actually the very essence of the existence of the said blog).

This is, per se, a great approach, although some can say people are plagiarizing. (Well, some of them just do.) Nevertheless, all these links and images are in fact forming that very context which we are so often missing when talking to other people. Now the problem is solved: just look into his or her blog (if he or she took care to put one together) and we can, at the very least, figure out some more interesting conversation topics than the weather or politics.

(In fact, a blog post without a link looks like a dead end, or like a book without pictures or dialogues... and I realize, I don't have any one in my mind right now... I guess I'm doomed!).

Two questions, regarding illustrations and dialogs:

1. Who owns the copyright for the pictures people quote?

2. Who owns the copyright for the dialogs (blog followers discussion)?

Both can be an issue.

First, the pictures.

Suppose I want to be a good one and post into my blog an image as a link to an original image which I liked (for example, the one from this page - at the time when my blog is written, this should link to a page with the largest treehouse existing now, I like those). I could have copied the image into my own place, of course, but this would be a copyright breach, so I don't.

Suppose NN years pass and my blog is still alive, but the blog to which I linked is removed or restructured. Therefore, my link will no longer work, and the image will no longer be available. In the context I was trying to recreate there will be a hole... and as a matter of fact, this can happen any time - even at this very moment - only with different probability.

Moreover, maybe the owner of that page was forced to remove it because he or she could no longer afford the hosting, and would in fact not mind at all that people are still keeping slices of the site? How could this be formalized easily? (It can be, but right now we don't have these mechanisms in place).

Now, the dialogs.

This is an even vaguer story.

First, in most blogs any reaction can be deleted by the owner (or edited beyond recognition). What will this do to the fabric of the dialog?

Some blog portals get around this by restricting the rights of the owners (not allowing them to change their replies at all, or allowing this only for a short time, or allowing deleting but not editing, et cetera...) No formalism here either. No guarantee that a sensible talk will not deteriorate with time into a set of chaotic remarks (it does not happen very often now, but all the technical possibilities for it are there...)

Am I again way too paranoid? Well, I would be glad if so!

...but I really, really would not want it to be impossible, in a couple of years, to understand what people were writing about. Granted, the Internet is supposed to be a living matter, like a living organism: some structures arise, the others die out... but isn't the Internet also supposed to be a place to keep our memories and feelings (and the blogs are a very good representation of them!)

Right now, all information appears to be around forever; but the truth is, it isn't. There is so much dependency everywhere.

There is the web archive, of course - but it doesn't keep the blogs as they were, only a couple of snapshots at most, and doesn't preserve the deep structure (like in forums)...

So the question still remains.

Monday, October 26, 2009

For all eternity...

In memoriam

Eternity, that old Egyptian cat,
Is coiling in the corner for the moment,
Serene and calm, apparently off guard.

But in a whim, all might forever change,
A magic lamp of Chinese porcelain
Gets broken and the little spot of light
Is yielded to the memory, until
It also fades and darkness fills its place.
A swarm of dreams, a meshwork of suggestions
Is suddenly eternally promoted
To wishful thinking. Wait... what is "eternal"?
A handful dozen years, and then one day
They'll find your breathless - shall we call it "body"?.. -
Whatever you have left as parting gift
For others, who, with grief or with relief,
Will take the care of putting it away,
For all eternity.

And what of you?
What of that point, from which you could behold
The skies, and plants, and planets, and the people?..
The orphan web of messages in words
And dreams, reflecting light which is no more,
Is no good answer. Where could have you gone?
Eternally unknown. As they say,
When Universe gets tired of work and play,
We'll see each other at the other side
Of starry sky.

It's not to verify
For all eternity.

But hope remains.

Good-bye.

Sunday, October 18, 2009

There was once a person utterly devoted to God. It so happened that he was aboard a ship which started sinking. Nevertheless, the person didn't panic, for he was confident that his Lord would see to his safety.
So there he was, in the cold water, when a boat passed by with some lucky survivors. "Hey, jump in!" - shouted the people in it. - "No, thank you, my brothers!" - responded the person. - "God will not leave me alone!"
After a little while, a raft came closer, with more people who had managed to escape the wretched ship. "Hey, get up here, bro!" - they called cheerfully. - "No, thanks!" - was the response. - "My Lord will save me when needed!"
After some more time, a log floated past our hero, who had already started to feel cold and weak, but was still firm in his faith. - "Get hold, quickly!" - moaned the wretched man who was also struggling against the water. - "With some luck, we'll hold on until they find us!" - But he, too, had to go on alone.
Soon enough, the person we are talking about lost the last of his strength and drowned...
...and there he was, in Paradise, among the angels playing celestial instruments and a cheerful crowd made entirely of those who went to church every Sunday.
Lo and behold, the Lord Himself was there, smiling in a way that would make a thousand Giocondas envious.
"Why didn't you help me down in the sea?"- was the first question uttered by the newcomer.
The Lord sighed.
"My dear son, I hadn't anticipated that you would be... how to put it... a little bit more slow-minded than I hoped for. For I sent you help three times, and three times you rejected it!.."

And now, the story.

It all started with the question: why is there an image - no, a meme - of an "evil genius" which is so persistent in many human myths? And why, at the same time, can his counterpart - a good hero - have many great qualities, with quick or strong wits not necessarily among them?

Speaking about the evil geniuses, Cain (the inventor of agriculture and the founder of the first city, also the ancestor of those who invented music - i.e. arts - and blacksmithing - i.e. technology) is the first one on record in the Bible.

In the European (both East- and West-European) fairy tales, there is often an image of an evil wizard who possesses great power and chooses to use it against humankind. Mind you, if there happen to be other wizards who are not that evil, the bad guy will undoubtedly be the smartest and most skilful of them. (Think about the Middle-Earth epic for the most generic examples - all wrongdoers there, Morgoth, Sauron and Saruman, were always the best in their class).

And if the genius does not happen to be evil, he is often portrayed as the one who doesn't have all his nuts in one box. A weird scientist who can figure out how to build a time machine, but will be utterly helpless outside of his comfy technological lair. As if from the three - being fit, being smart and being good - you are only allowed to pick two!

Well, what happens when the good heroes do need some wisdom to get by? They all tend to get it by some miraculous chance - be it by finding for themselves a wise fiancée or by coming across some wizard-in-disguise to whom they do a favor (strong guys are often so naive that they don't mind helping the others without expecting anything in return ;)) et cetera. I can't remember many tales which would start with the story of some diligent master who acquired a great skill in something and was never tempted by the dark side of the power. (The quickest example springing to mind - Star Wars, a modern epic which might be related to some older myths - the always-good Obi-Wan excels at swordplay, whereas the villain-in-the-making Anakin started as a technology prodigy...)

It seems to be reiterated so many times in so many forms: people, don't tinker with technology, it might bring you into trouble before anything else happens and it is not even needed for your well-being!

One other Biblical story comes to my mind: that of Mary and Martha, when Martha is busy doing the cooking and cleaning the house and the only thing Mary does is sit near Jesus Christ and watch him lovingly. Martha feels like she's left alone to struggle, she starts to scorn her sister and gets mildly rebuked by Jesus, who says that her sister has chosen "the better part".

Isn't Martha here an embodiment of our technological, scientific and other activities? Then the Bible says very clearly: people, that's not what you need.

Isn't it a bit puzzling?

Why create a world with a set of complicated rules just to say that those who choose to explore these rules are not doing the better part? Are all these rules just a disguise, with the real rules hidden? (The Bible appears to hint that the real rule is to love God - and therefore his creations, people among them - and nothing else than that).

Is God a being who has problems with self-respect? That could be a good pretext for the creation of the sophisticated world with the only purpose: to find out how many persons will discard all pleasures of that world and go searching for its Creator, full of love and devotion. (Make sure the pleasures won't last long, otherwise what if nobody would?)

I am not an atheist, by the way, but I can't explain this luddism, this mistrust in science and technology lurking behind so many theories. (Ancient Greeks were free from it, by the way - Hephaestus was a good inventor and Prometheus was a suffering hero who wanted mankind to get better).

Maybe this resistance is supposed to be the force opposing progress, to keep humanity from rushing along too fast? This can make sense...

Nevertheless, I wish there were more space for the Greek way of thinking, especially now, in the XXI century. If we are drowning, we are allowed to use whatever comes to hand, why not? And if we aren't, then what was the whole point of these things coming by in the first place?

Thursday, October 15, 2009

...

All that recent turmoil around the "headscarves and handshakes" reminds me about the joke where a drunkard late in the night tries to find his keys under a street lamp. Not because this was where he dropped them, but because this is the only illuminated spot.

Wednesday, October 14, 2009

I am in the mood for some rather dark forecasting today. Let's see if there is enough gunpowder left to make it happen...

Do you remember an SF movie (for teenagers, of course - who else is more eager to buy every little hint that the world can be different from what their parents so dully insist?) called War Games? In this movie, a socially inept but cute-looking little prodigy accidentally (with the help of a second-hand modem and duct-tape coding efforts) gets connected to a Pentagon supercomputer, has not the slightest idea what he is getting into and doesn't care (being a teenager's mother, I totally believe that bit!) and starts a game which turns out to be a nuclear war scenario run for real (this bit is the greatest assumption in the movie, but we'll come back to that). He gets chased by half of the US armed forces, but miraculously escapes them, somehow tracks down the way to solve the problem and saves the world at the last moment. Happy end, and the young nerd-in-the-making even manages to get himself a cute girlfriend as a bonus.

30 years later, Gary McKinnon, another no-longer-kid (still looking rather cute, by the way), tries to research whether the US armed forces are holding back some information about alien visits to Earth (I think there was an SF movie with a similar plot!), breaks into the Pentagon computer system (not as far as the kid from that earlier movie, though) and, cutting a long story short, is now facing extradition to the US and spending the rest of his life behind bars. No cute girlfriend here, only a desperate mother going all possible routes to save her poor prodigy from all that. A very sad story with a very uncertain outcome.

To me, this sounds like an explicitly clear message from the governments involved: listen, cute and smart guys and girls, the Internet is no longer your playground and computing machines are no longer your toys. We use them for real; don't mess with us, or else.

It is not the first time. Remember the story of Kevin Mitnick? He got away relatively easily, in comparison to Gary, but the message was essentially the same.

My bet is, this message is going to be reiterated again and again. But that is not all.

How about blogs and all information we put into them? Right now, it is very unregulated.

- Everybody can blog under any persona, imagined or real, without legal problems;
- A blog provider can shut down the service, is not obliged to ensure that the created content does not get lost and can even claim rights over the content we create;
- There is a grey area as to whether the information found in blogs can be used as legal evidence.

Legal people being the first ones concerned to keep their jobs intact, I expect this might change, sooner or later, possibly along the following routes:

- A licence could be introduced for being an "information pool provider" (or call it what you like) for those who run a service allowing the public to create and upload their own content;
- The registered providers will have to comply with the law which will, sooner or later, come into being. Among the requirements there could be:
  - Preserving the content and going through legal motions before shutting the service down to decide what is going to happen to it;
  - Providing the official structures with any data access they require in a number of situations (no doubt they will be scrupulously described);
  - Providing the means to verify the identity of those who create the content.

It's the identification bit I am most interested in. What will it become when it settles? Will there be some smartcard like the banks use? Wireless implants? A universal ID bound to a social security number?

Of course, there will always be unverified blogs, but they can also be demoted in the public opinion into the area where good people don't go... every city has those, we pass through them briefly or for fun, but no person in his or her right mind would choose to live there, right?

Am I paranoid in thinking about all that?..

Monday, October 12, 2009

Abstraction versus Reality...

The need for abstraction comes from the simple fact that no single human mind can hold the whole endlessly complex picture of anything. Therefore, at some level we all have to cut off.

Whatever is hidden below, will be all thrown together and represented as a set of simple interfaces at that lowest possible level which we still can access (colored dots, lines and planes, mnemonic set of rules, or whatever else you prefer). How these interfaces are really implemented, and what lurks deep within, we often have no idea, and in most cases, don’t even want to know for fear of losing the “whole picture”.

As a consequence, because our internal representation still remains incomplete, we will sooner or later tamper with those unseen dark whirls, folds and clouds too much. Then they rise above our cut-off level and bring unpredictable chaos along.

And this will be the system talking to us.

Sunday, October 04, 2009

If our understanding is organized in layers, then:

- whenever there are too many layers for us to build, the presented concept would appear too cumbersome;

- if there is nothing new to add to our internal webwork of ideas, we label the text as boring and/or trivial;

- “nice reading” is when we can use whatever is at our disposal to get the meaning and to add one or two new festoons to our own mind’s garments without much effort;

- and “fascinating” or “encouraging” is when the concepts presented, even if we can’t get them immediately, lure us into building yet another layer of meanings for ourselves, just in order to be able to finally decrypt the message which those strangely beautiful reverberances of somebody else’s mind seem to be holding within.

Sunday, September 13, 2009

Found a remarkable piece in The C++ Standard Library by Nicolai M. Josuttis (a very useful book, by the way) at the beginning of the chapter dedicated to valarrays:

The valarray classes were not designed very well. In fact, nobody tried to determine whether the final specification worked. This happened because nobody felt "responsible" for these classes. The people who introduced valarrays to the C++ standard library left the committee a long time before the standard was finished. For example, to use valarrays, you often need some inconvenient and time-consuming type conversions...

How many bells does it ring for you, dear software development experts? And we are talking about the C++ standard here :) (More revelations about design flaws and omissions can be found in the chapters about the other Standard C++ library components - bitsets, for example).

By the way, the book was written 10 years ago and it's still relevant because not many things have changed there. The work to introduce a better C++ standard is going anything but quickly.

This probably explains, at least partially, at least to me, why younger folks are turning to other languages (I mean the Java-based crowd), which are largely feedback-driven, able to adapt and/or evolve quickly, and which don't have a gloomy committee overseeing the Grand Design in an authoritative way...

Thursday, September 03, 2009

Confessions of a newbie ereader user

It has been 2 days since I received my first ereader device. This one. Cybook OPUS. I have spent quite some time of these 2 days trying to come to terms with this thing... it's probably a good time to put all this experience together, before it's safely forgotten / moved into the realm of the unconscious.

What do I like about the toy? (In no particular order)

1. It is cheap. 250 euro is not that much.
2. The screen quality is not bad at all. 200 dpi, 800x600. And the eyes don't get tired because there is no backlight. Just like a book.
3. It has 1 GB of internal memory. Wow!
4. The connection with the PC is not too painful. When the device is connected with USB cable, it is seen as an extra storage (even Vista doesn't mind, although it keeps prompting that it might be necessary to scan/fix it every time I connect the device ;) ). Putting files on it and getting files from it does not seem to be a problem.
5. They give away the source code under GPL. Seems to be a nice move.
6. The device has native PDF support.
7. Accelerometer, which reorients the text automatically when you turn the screen. There are issues though.
8. You can have any number of books "open". When you "open" a book the next time, it will be at the last page you have read. Nice!
9. There is out-of-the-box support for 20 languages, including Russian. That also means one can have the UI in any of those languages. And the directory names. And the file names. Nice! Apart from that, Bookeen (the producer) claims that Chinese, Korean etc. can be available for text files once you put the proper font into the /Fonts directory, but I haven't checked. Besides, there seems to be an issue with text files... now, is it time for boos? Yes, it is.

What do I not like that much? (In no particular order either).

1. The screen size. 3x4 inch is a bit too small for PDFs. Well, not _extremely_ small as you can have 70% of original size (not without some tricks, though, if the original article has side margins), but I would prefer to have 80% instead. But then, I am highly myopic and I feel it even with lenses, so maybe those who don't have problems with their eyes would not really mind.
2. Bug #1. It doesn't recognize UTF8 text files. As simple as that. I have tried to read a Russian text file saved as UTF8 and Cybook thinks it's ASCII. I have tried the UTF16 format too, same result. How did they check that they can read text files in other languages? Which encoding did they use? I wish I knew. Definitely not the ones I've mentioned.
3. Bug #2, discovered today. If you switch the accelerometer off, use the device for a while, then switch it on again and quickly turn the device upside down, the thing gets stuck. It got stuck so profoundly that RESET (hidden under the battery cover) didn't help. The only thing that helped was to disconnect the whole battery and connect it again (the image on screen still appeared frozen, but after I switched the device ON, it obeyed).
4. Speaking about PDF scaling, why keep the left margin on scaled-down PDFs? I have solved this problem - removing the margin from a PDF I just wanted to read - (it was this one, which is a bit above my humble understanding abilities, but still seems to be as interesting as a fairy tale in the language I pretend to know;)) via GSview (you can get one from here if you didn't know it already). This resulted in zooming to 70% instead of the 56% which I could achieve using the device's own option "adjust to the width". Believe me, it does feel like an improvement.
5. EPUB files are read just fine. Unfortunately, there are not really many formats the device can understand: PDF, EPUB, TXT and that is all. (They claim they can read HTML; I didn't succeed with that, maybe my HTML files were not created the right way? On the other hand, EPUB is nothing more than a zipped XML + CSS...) What about DJVU, FB2, PRC and other pocket reader formats?.. (BeBook can read all of these, by the way, and is a tad larger with its 6-inch screen; I am considering buying it too as I am not the only one in our family who likes reading...) I will say a bit about conversion options later.
6. The file names in the browser seem to come from the internal properties of the file (for a PDF, it will be "Title"). Noticed issue: if you replace a PDF file on the device with another one having the same name, the device will show the "name" corresponding to the "Title" of the previous file. Apparently it never gets deleted somehow...
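
For the curious: I have no idea how the Cybook firmware decides what encoding a text file is in (Bug #1 above), so the following is only a minimal Python sketch of one common approach, not what the device does. It looks for a byte-order mark first and then simply tries strict decoding. The function name `guess_encoding` and the sample string are made up for illustration.

```python
import codecs


def guess_encoding(data):
    """Guess the encoding of a text file from its raw bytes (illustrative only)."""
    # A byte-order mark (BOM) at the very start is the most direct hint.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"
    # Bytes that decode as ASCII are plain 7-bit text.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # Otherwise try strict UTF-8; random 8-bit text rarely passes this check.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"


# A Russian sample saved as UTF-8 without a BOM.
sample = "Привет, мир".encode("utf-8")
print(guess_encoding(sample))
```

A reader that falls back to ASCII whenever there is no BOM would show exactly the behaviour described in Bug #1 for a BOM-less UTF8 file, but that is a guess about the cause, nothing more.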

All said, the biggest problem, for me, is the screen size. Why are most ereaders released in Europe so small? Even 6 inch is not that much. There is IREX with 8.1-inch screen but it costs 2 times more!

A couple of tricks & more whining:

1. About conversion. I have tried Calibre, wanted to create an ePUB from an FB2 file. Don't do that. At least, don't do that if you have a non-ASCII file (French, German and Swedish people, don't think it will not affect you with all your diacritical signs :) ) The thing is, if you have a file with Unicode characters, you are supposed to have Unicode fonts. And these fonts have to be embedded into the ePUB file. Calibre either doesn't do that, or I don't know how to switch it on. Besides, it produces rather messed-up xml.
2. The best solution I have found for fb2 conversion: Google conversion tools. They also have doc and rtf to ePUB converters there, but I haven't tried them yet.
3. For the text files, here is an interesting link: http://web2fb2.net/ .
4. For the djvu, the way to convert seems to be via one of the free PDF printing utilities: FreePDF or doPDF. Then you can print your file and specify one of those as the printing destination. As a result, a PDF file will be created; quality is more or less OK but the pages with pictures might be blurred when seen in smaller size, and the resulting PDF will be bigger than the one produced with Acrobat itself.
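
Since an EPUB is essentially a ZIP archive with XML and CSS inside, it is easy to peek into a converted file and see whether any fonts got embedded. Below is a minimal Python sketch under my own assumptions: the helper names `list_epub_entries` and `has_embedded_fonts` are invented, the tiny archive is built in memory just for the demo, and checking for .ttf/.otf file names is only a rough heuristic.

```python
import io
import zipfile


def list_epub_entries(source):
    """Return the names of the files packed inside an EPUB (a ZIP archive)."""
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()


def has_embedded_fonts(source):
    """Rough check: does the archive contain any .ttf or .otf files?"""
    names = list_epub_entries(source)
    return any(name.lower().endswith((".ttf", ".otf")) for name in names)


# Build a tiny EPUB-like archive in memory (hypothetical content, demo only).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("META-INF/container.xml", "<container/>")
    archive.writestr("OEBPS/chapter1.xhtml", "<html/>")
    archive.writestr("OEBPS/style.css", "body { margin: 0; }")

for name in list_epub_entries(buffer):
    print(name)
print("fonts embedded:", has_embedded_fonts(buffer))
```

With a real file, a path such as "book.epub" (a placeholder name) can be passed instead of the in-memory buffer.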

A really crazy wish!

I wish somebody would be brave enough to release an eReader combined with a multilanguage dictionary.

Now, it's probably not all, but I am tired :) Hope it was not all completely in vain, and thanks to the people who responded to my tweets regarding the matter!

Friday, July 31, 2009

Thinking about thinking...

Humanity, if we give it half a thought, is such a loosely-coupled system. The evolution of human thought goes forward in leaps and bounds because every single "thinking unit", in order to produce _anything_ more or less valuable, must first spend some amount of time picking up the existing knowledge (and nevertheless, there are still endless rows of reinvented bicycles in any chosen area). There is no guarantee that the knowledge you or I pick up is exactly the knowledge we need. Thanks to the Internet and search engines, what used to be an art (remember the stories involving a scientist or writer looking for a key, spending nights in libraries and then accidentally stumbling upon the very reference he or she needed to break through?..) becomes more and more a technology, a routine.

Is it good or bad? Every time a certain activity becomes more predictable, it loses some of its charm. Learning how to produce fire at the dawn of human history might have been a thrilling experience, an adventure in itself, a part of a rite of passage into the realm of the adults. No adventure in lighting a match or a lighter nowadays (unless it's a child who accidentally sets the house on fire, of course - but safety devices evolve as well!) So, is it already written upon the invisible walls that there will be no adventure in scientific research, either? So much so that we will be able to give it away to the machines, let them chew information and milk them for the useful results? What about us, the self-proclaimed kings of nature? Lots of confusion arises as soon as more and more people begin to realize that more possibilities will bring not only quantitative, but also qualitative changes in the way we live.

Will it mean that the way humans are evaluated will change? Why do we hold philosophers, writers et cetera in high esteem? Because of their ability to evaluate information and make links between pieces which were not obvious to the rest of us until those enlightened ones pointed them out? Because they create works of art which produce a strong resonance in the recipients? How much of this will get "outsourced" to the artificially created information processing units?..

Of course, there is one thing which is never unnecessary to point out. The machines, both hardware and software, are not a manifestation of an alien mind. They are nothing more than a summary and a quintessence of our own mind - we are the ones who created them all. Nothing more than a codified experience of humankind, the very experience which until lately has been transferred only via a painful learning process for every single human being.

Still, it concerns me that the young people of today learn how to use Google for their schoolwork, bypassing their own thinking about the results. It can lead to a scary situation, with the young generation becoming like monkeys using the machines where the experience of an older generation is stored, without actually understanding it. How real this threat is, is difficult to tell. There is no similar experience in human history; we can try to produce mental models of the possible outcomes in novels and movies, but in reality the only thing we can do is to observe, to think, and, maybe, to act if we feel like it...

Will all people eventually become united into a sort of huge network, therefore making sure that not a single experience gets lost? But how to ensure that this experience is passed on? Reading about something will not produce an adequate response unless you have had a similar experience yourself - the words are often used as reminders rather than descriptors. Is it a reason why history tends to repeat itself from time to time? Would there be a chance to eliminate it (with virtual reality experiences, for example)? What are our dreams - especially the "scary" ones - if not a sort of virtual reality experience too? And yet the dreams can influence the way we live...

Still, it feels so interesting to see where the things are going...

Sunday, July 12, 2009

On translations, the choice of words, and software development :)

A couple of days ago, I received a link to an interesting article about the woes of translation (in Russian) - thanks to Dmitry Reznitsky for it. For those who can't read Russian, it is about the funny consequences of the fact that in Russian almost all words have a gender prescribed by the grammar rules. Therefore, when the gender of a word in the original language differs from the gender of the same word in Russian... Houston, we might have a problem.

And a problem indeed. Take the well-known fairy tales. In Winnie-the-Pooh, the one who had to undergo a gender-changing procedure was the Owl (every Russian kid knows that the Owl is an old ladylike creature in a lacy bonnet who desperately tries to appear wise, but can't because of her sclerosis or something - thanks to the very popular cartoon). In Alice in Wonderland, the Caterpillar, the Dormouse and many others had to become women, too. (The author of the article rightly notes that some scenes become completely absurd as a result.) And then, well, The Jungle Book and the brave warrior Bagheera who, in the Russian version, is a charming and fetching slim black panther (this fact is noted in the English Wiki because of yet another Russian cartoon which was, believe me, not bad at all, even with a female Bagheera mutilating the original plot :) ) The translator could have chosen another word, at least in this case (the word "leopard", which means the same, has masculine gender in Russian), but it didn't happen, and now it would be extremely difficult to correct this mistake, because the existing translation is sort of canonical, because every Russian from the age of 4 couldn't help having seen the aforementioned cartoon a zillion times already, et cetera...

So, this is a good demonstration of how the choice of words can influence the meaning.

Now I would like to say a couple of words about sailing. I took part in a short sailing trip a couple of months ago. The most revealing experience was how important coordination is when you are on the boat. Everybody must somehow work together, otherwise you won't go anywhere. The whole crew has to be, so to speak, in sync. Moreover, the other boats should also take yours into account, just as you should keep them in view and react in a way that is beneficial to the others. There are rules defined with that philosophy, and you should honor them. You just can't not do so. Also, it is worth noticing that the crew always has a clear purpose - to get somewhere - and it is always possible to say whether this purpose has been achieved or not. If it has been achieved, the whole crew gets the credit; if it hasn't, the whole crew has failed (from inside, of course, you can always organize a scapegoat hunt, but the outside world can't see the scapegoats on its own).

Very different from football, for example, or baseball, or rugby, where not only does the purpose of the team seem to be nothing else but to win against the other teams (keeping to the rules just enough not to be punished for breaking them), but any particular team player can be competing with the other team players for recognition from the public. The team doesn't create anything. The team is here to win, that is all. If they win, the good guys are rewarded, and the losers are eventually sent away.

The third example of a human collective which I know about is a choir. I sing in one myself. The philosophy behind choir singing is, once again, different, but it seems to be closer to the philosophy of a boat crew than to the philosophy of a team. The choir has to be in sync. The choir has a clear purpose (to deliver a song). And this purpose is achieved only if everybody manages to work together and the conductor can establish proper rapport with the choir.

Now, finally, what does it all have to do with software development?

I'll try to explain.

Usually, when one speaks about software development, one speaks about "teams". "Teams" are doing "tasks" with the purpose, apparently, to participate in the process of creating something useful in the end.

I have got the impression that the team metaphor doesn't fit very well. If the purpose is to deliver something, rather than just "win", then it seems like the software group should be a "crew" or even a "choir" rather than a "team".

The team metaphor means there can be losers and winners within a team, and that the outside world knows about it too. Also, it means that if the team loses, they might be inclined to find the reason for their failure in the fact that some outside authorities were too harsh on them. And of course, there is a lot of competition: with the other teams as well as within the team itself.

Maybe we would be better off emphasizing the need for the team (or crew) members to find ways of cooperating with each other and to concentrate on keeping a goal in focus (like the people on the boat should do). I sometimes feel that too much energy is spent on nitpicking the microstrategies, and sometimes it comes at the cost of the things which really matter.

Maybe I'm terribly wrong in all that, but I am sick of the talk about "teams". As simple as that. We are not writing software only to win against those who might be writing the same software, but had less luck. We are writing software to create something really useful. How can we create something really useful, if the main emphasis is on competition rather than cooperation?

I would really like to get away from the analogy of a team, in sweat and soap, which tries to win at any cost, even if that means changing the goal or bending the rules in some other unpleasant way. Cooperation and synergy are what we should be aiming at. Things which help achieve this in the particular environment are good. Things which hinder the process are bad. And one can never say in advance what will work for this particular... crew, collective or however else you decide to name it. A collective consists of people; no two people are alike, so how could there be two collectives alike?

That's all banter I have to deliver on this, for the time being ;)

Thursday, July 09, 2009

Twitter Haiku (well, and some micropoetry too)

The ones (till July) which I find more interesting:

little flocks of birds / in azure transparent sky / not a day to die

You cannot enter same river twice / flip over 'to come' and you'll read 'to go' / let the memories take your place / while they still grow

the poetry flows / from cracks in the souls / it takes to be blessed / to write when not stressed

sky that's full of stars / and the chaos that's within / we are in between

it seemed like a voice / I've reached and found a wall / beautiful echo

Feel wordless tonight / my wandering thoughts forgot / return to my mind

turmoil in the soul / either the heart is too quick / or the mind too slow

вся душа вразнос / сердце мчит без тормозов / разум не дорос

hard-resetted phone / and destroyed all my past / what an easy fix!

birds sing in the dark / before the light ever came / there was a music

the words draw contours / the actions add colours / the purpose brings life

love wakes up in the heart / like a child almost ready / to be born / kicking from inside / and smiling already

ducklings in channel / males already advancing / on stupid females

3 knots at its best / sun splinters in the water / murmur of silence

bottom of the night / wind is dancing outside / carefree and wild

reaching deep within / what is flickering therein? / neither Yang nor Yin

I left the haven / few scratches that's all it took / the sea is all mine

Покинула порт / Что мне пара царапин? / Все море - моё

сырое небо / деревья замирают / в объятьях бриза

the sky is all wet / the distant trees are trembling / hugged by the breeze

thunder far away / wants to mark the passing day / bright instead of grey

гром гремит вдали / день, который мы прошли / светится в пыли

pointers to functions / do-it-yourself hand grenade / just point and throw

the world is music / you want to participate? / try singing in tune

color of the wind / and the touch of kindred soul / nothing more than that

the waves try to reach / my feet - but - shy or tired / they keep retreating

the world stays in blue / mixed with gray clad in rain / missing the sunlight

night stitched by lights / from cars lamps stars and windows / from wishful thinking

garden of passions / if not tended properly / turns into jungle

child peering at me / from behind a green train seat / mischief managed!

the sweetest tales / are woven with blood and tears / from stinging nettle

once in a while I have funny feeling / that words can be used to recover the meaning :)

crouch the railway / oh train / but slowly, slowly

pale lights quick shadows / light dreams dim reality / all intertwined

plane rumbling above / for a moment outcries / a desperate heart

рёв самолета / лишь на миг заглушает / сердце в разладе

chilly taste of despair / sultry waves of desire / iceberg on fire

the understanding / starts when I question myself / who I truly am

driving in the heat / dozing off when try not to / dream within a dream

noli me tangere / I'm afraid to disappear / into thin air

at the sunlit roof / doves are tapping messages / with their beaks and feet

you want me to land / on the runway of your hand / make me understand

The verb 'to love' makes little sense / both in the past and future tense

birds try their voices / and make the morning sparkle / with jewels of haiku

at the crossroads / only the moon from all signs / looks into my face

на перекрестке / из знаков только луна / мне смотрит в лицо

the stars are dancing / to the rhythm within my heart / murmuring their songs

my words intertwine / with the phrases of others / and cease to be mine

you're not the wizard / you do your magic from books / but don't give up yet

a quiet morning / my room is bathing in light / like a crystal cave

ты не волшебник / ты колдуешь по книгам / но не сдавайся

тихое утро / комната светом полна / как грот хрустальный

piano playing / whirl of red and yellow leaves / dancing in the sky

whirl of sparkling words / whatever sky it brings me / there's no place like home

words are just symbols / footprints of reality / or blueprints of dreams

the rain is dancing / barefoot on biking lane / only for itself

eternal cadence / from hope to desperation / life as usual

self-referring dream / endless sticky Möbius strip / waiting for the flies

twilight in chaos / no longer trust my own thread / it's part of this maze

night tiptoes around / shy to tell it's time to sleep / and she waits for you

singularity / from darkness to dance of light / from chaos to love

bicycles swarming / around railway stations / catching sun glimpses

The rest is here :)

Tuesday, May 05, 2009

A long banter about learning and getting tired

Everybody wants to be effective nowadays. That means: do the most in the least available time. So many things to do, so little time for them. Unfortunately, the capacities of humans are very limited. From time to time, we need rest. If we are tired or if there are too many distracting things, our attention gets lost. As a result, nothing really gets done.

I am an easy victim of all that, one who tries to do her best and become effective. Recently I have learned about a technique of self-micromanagement called Pomodoro, which, to put it roughly, means you plan your day ahead, prioritize what you are going to do, split all these things into ~25-minute chunks and see how many such chunks (they are called "pomodoros" because that was what the kitchen timer of the author looked like) you can do in a day. At the end of the day, you do a retrospective and give a thought to what could have been done better.

In essence, it is an agile methodology of sorts, but applied on an individual basis and not necessarily to activities related to software development... although, I have to say, while software development is quite suitable to "pomodorize" or to make agile in any other way, other activities might be more stubborn to handle like this. The key in such techniques is your ability to estimate beforehand how much effort you (or your team) will spend on a given task / on a "standardized unit" of effort. Of course, you are supposed to develop better estimating skills with time. In addition, it is expected that you will get some demonstrable result at the end of every chunk of activity. Otherwise it's all nonsense.

For now, I am convinced that while it is feasible to do this in the fields where your activity is predictable (more or less), the more room is left for creativity in your work, the more difficult it becomes to estimate when you will be finished. Can you plan creativity? It comes unexpectedly. The workaround is not to include creativity in the estimations, I am afraid. If it comes and makes things easier for you, OK; if not, there should be a plan B to finish the job, maybe in a less elegant way. But when you are writing a poem or a musical piece, you can't stop after 25 minutes are gone, otherwise you'll lose it forever. So in these cases the structure does not apply.

But I am distracted already :) What I have noticed is what happens when you are more or less tired (it becomes more visible then). Not sure if it is a common phenomenon, but in short, I see the following: the more tired I am, the fewer "quanta of new information" I can get from one activity without a tendency to switch. When I am very tired while trying to do something (e.g. reading an article which I know I have to read, but it does not go easily), then I tend to switch off _after the very first piece of new info which I have processed_. If I am less tired, the number of "new info quanta" will be larger, but the "moment of saturation" is always very distinct.

How to mitigate that? One technique I have found useful is to make side notes. This way, the brain does not have to keep the new data in the short-term memory (probably) and you can keep on reading for longer; besides, making notes and reformulating the information in your own way is a very helpful exercise when you are learning new stuff. And it is great for retrospectives, if you are in a state to make them afterwards :)

Therefore, it makes sense to estimate the difficulty of the text when planning to read it. For one type of information, it might be 1-2 pages; for another type of information (e.g. which is not entirely new or which is not perceived as very important, so that the strain to remember everything as exactly as possible is less) it might be 10-15 pages or more in 25 minutes. I guess for the same book it will be more or less constant, or the difficulty might be increasing as you go (for example, the excellent textbooks on theoretical physics written in the Soviet Union by Landau and Lifshitz were notorious for a high gradient of difficulty - the influence of the genius Landau, of course - which made quite a few of my fellow students - who were tired all the time almost by definition - use the Feynman course instead, which had a smoother gradient).
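
The page estimates above can be turned into a quick back-of-the-envelope calculation. This is only a minimal sketch, assuming a fixed reading rate per 25-minute chunk; the function names are hypothetical, and the breaks between pomodoros are not counted.

```python
import math

POMODORO_MINUTES = 25  # length of one chunk, as described in the post


def pomodoros_needed(total_pages, pages_per_pomodoro):
    """Number of 25-minute chunks needed to read `total_pages`.

    `pages_per_pomodoro` is your own estimate of how many pages of this
    particular text you can digest in one chunk (an assumption you refine
    with experience).
    """
    if pages_per_pomodoro <= 0:
        raise ValueError("pages_per_pomodoro must be positive")
    # A partly used chunk still costs a whole pomodoro, hence the ceiling.
    return math.ceil(total_pages / pages_per_pomodoro)


def reading_minutes(total_pages, pages_per_pomodoro):
    """Pure reading time in minutes, breaks excluded."""
    return pomodoros_needed(total_pages, pages_per_pomodoro) * POMODORO_MINUTES


if __name__ == "__main__":
    # The same 30 pages, read as a dense text and as an easy one.
    print(pomodoros_needed(30, 2), reading_minutes(30, 2))
    print(pomodoros_needed(30, 12), reading_minutes(30, 12))
```

With these numbers, 30 pages of a dense text at 2 pages per chunk come to 15 pomodoros, while the same 30 pages at 12 pages per chunk fit into 3. A book whose difficulty grows as you go would need a different rate for every part.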

The most difficult of learning activities, it seems to me, is learning a language; in this case, you _are_ supposed to put all this stuff into your own head and nowhere else; that makes even the drills a hard effort when you are tired. I have found that if I did not have enough sleep, I would be OK with reading email, twittering or discussing life with friends or kids, but concentrating on the aforementioned Rosetta made me half-asleep in 10 minutes or so. The brain was sending frantic signals that it was overheated. In such a situation, I think only two options are possible: either to stop all activity and go to sleep or, if for some reason you can't, to do it in very small chunks: 10 minutes or so - so that you get 3-4 such "quanta" and no more.

I wonder if it would ever be possible to measure the "informativeness" of a given text relative to a given person. Even to measure an "absolute informativeness" could be interesting. Suppose some time in the future every book or article will have an "information value" printed on the cover or below the title? Or a matrix of values relative to some areas of knowledge? Might be considered as an insult by some authors probably...

Well, that's all for now, thanks for the reading, and I hope the information value of this piece of text is positive :)

Monday, April 27, 2009

Ideas, thoughts... rushing at you, like gusts of the wind at the cobweb, and coming through, and all you are left with is some flimsy patches with threaded edges...

A long time ago, the nights used to be black-and-white. Now what we see, or at least what you can see walking along the canals at night in the centre of Amsterdam, are different shades of ochre against a dark blue sky. Little neon lanterns, staring into the night with a myriad of orange eyes, did that trick. Orange and blue are psychedelic colors. The night in Amsterdam is psychedelic by definition.

Still, Amsterdam, the centre, is one of the most beautiful places at night. Walking down the quiet streets along the canals with the dark glistening water, along the big trees with spacious crowns made from fresh spring leaves, passing here and there little cafes and restaurants, brightly lit, with the late-night public quietly talking outside, near big mansions huddling together along the canals, just like 200 years ago... it feels almost like being in love; maybe even better, for Amsterdam will always give you some love back - you can be certain of that...

I feel almost helpless now, playing with bits and pieces at the entrance to the big and mysterious grotto of the English language... it's all I can do...

My train is coming to my destination. Thank you, Amsterdam, for a beautiful night.

Sunday, April 19, 2009

Years and humans, folks and heroes,
All will run away for good,
Like the streams from thoughtless flood.
In the Nature's vibrant mirror
Stars are fishnet, we are draught,
Gods are wights from murky naught.

from Velimir Chlebnikov

Wednesday, April 15, 2009

Rosetta Stone: the goods and the bads :)

As I am almost finishing the complete Rosetta Stone language course (I took Spanish levels 1-3) I think I am in a position to more or less summarize my experience, the goods and the bads, so to say.

The goods (in no particular order):

- once you've got a subscription, you can access the course from every computer and every operating system which has sound input/output and graphics;

- you can train your pronunciation for the given language;

- the learning sequences are created in such a way that you really barely need any explanation while learning new words and even phrases;

- you'll get a good (although very basic) set of really "common denominator" words and standard phrases, including everyday vocabulary (e.g. basic colors, parts of the house, furniture, body parts, common activities etc);

- you will pronounce, hear and read the same words and phrases so often that you will definitely remember some. Repetition is the mother of learning;

- with some luck, you'll be able to pick up some grammar features: tenses, cases (if applicable), mood etc.

- at the end of the course (all 3 levels) you seem to arrive somewhere in between the A1 and A2 levels (see explanation of them here). You will be able to understand basic stuff and even utter some very basic sentences. Of course, if you stop using the language every day, you'll forget it very quickly, but that's another story.

The bads (also, in no particular order):

- The context. The principle of Rosetta Stone (hence the name) is that all languages are treated exactly the same way: you'll get learning sequences describing everyday situations in a context which is supposed to be the standard Western life environment, but to me, it seems that it is the US urban life environment in particular. I mean, it doesn't look European enough :) Jokes aside, you are learning the language completely separated from the native language context. It is probably OK to have this knowledge if you are going to talk to, for example, French or Chinese people who have emigrated to the US. But expecting to use it in the land of the given language seems to be a bit of a stretch.

- Logically follows from the previous one: you learn the very standard vocabulary, no colloquial words or phrases, and practically no synonyms. That's enough to make yourself understood but not always enough to understand (unless the other person is ready to help) - exactly the description of the A1 language competence level.

- The "no explanations" principle doesn't work well for the grammar (especially for the complicated parts, like tenses and moods of the verbs). As I've already mused in my previous post, you'll probably create some system based on what you have been exposed to (and it's way less than a real child would be exposed to in real life!) but there is practically no guarantee that this system would correspond to the real grammar 100%. Remember, when you were a child you went to school and got grammar lessons in your own language. And even before that, when your mother or father spotted what they perceived as a systematic mistake in your speech, they not only corrected you, but probably also tried to explain the rule behind it (e.g. "when you talk about what one person does, put an "s" at the end of the action word: Pete reads but we read" - or something like that; I do not know exactly how English-speaking parents do it, but I am almost confident that they do it.).

- The clarity of the course seems to dwindle by the end, plus the learning curve becomes too steep (not enough data for the real osmosis, see above). You simply don't have time to figure out how on Earth all these tenses and moods are constructed. As a suggestion for improvement, I would propose adding a formal grammar reference in some form.

- You don't really exercise in speaking (i.e. in creating phrases yourself): you are always supposed to create phrases according to very rigid patterns (which does not require extremely advanced speech recognition software, as far as I know :) ) - no synonyms, no changing the order of the words even if it's valid, etc. A bit boring.

Summary: Rosetta is a reasonably acceptable tool for getting the very basics of the language and/or memorizing some everyday vocabulary. If you want anything more than that, you have to get it from somewhere else :) E.g. if you want to get a feeling for the language, you need to try putting yourself into the context (and if we are talking about "osmosis like a child does it", books for little kids are a great resource). Otherwise you'll learn a projection of the real language into a different context. If that's what you need, fine, but you'll hit the limits as soon as you try to get beyond social trifles and start speaking/reading/thinking about Stuff That Matters.

Important: The price of a 6-month subscription for Rosetta Stone is 180 euro (last time I checked). It is a bit on the high side but still not sky-high. I've got the subscription through my employer though.

Saturday, April 04, 2009

I have just started to read the book "The Language Instinct" written by Steven Pinker. The beginning of this book, and my recent experiments with the Rosetta Stone course (I have been following the Spanish course and am almost done with it by now...) have inspired some thinking.

First, I really don't like Rosetta's principle of relying fully on osmosis when it concerns the grammar. I'll try to explain why.

The point of the Rosetta Stone system is that every human being, when he or she was a child, learned his or her own language via osmosis; therefore, if you try to (very roughly) simulate the environment in which the child learns the language (as much as the structure of the computer program allows it), you can teach an adult person the new language without actually explaining anything, just by the trial-and-error method.

Well, to some extent, it works. It is a great way to remember the meaning of words. But but but!.. The grammar!.. Don't you have to go to school for the grammar lessons in your own mother tongue?.. And if you don't, will you be speaking properly? (You know the answer - chances are big that you will end up speaking some dialect but not the language the way it should be spoken...)

Now, Pinker comes into the picture. He mentions there the interesting phenomenon known as "creolization". The schema is the following:
- first, many people from different cultures are brought together; they don't know each other's language and have to choose some "common" language (historically, those chosen languages happened to be English, French, Spanish, Dutch and Portuguese) to communicate with those who brought them (the bosses) and the other coworkers (the peers).
- as a result, "pidgin English" (or whatever) comes into being - not a language, but rather a crude simulacrum, most importantly without a consistent grammar; you need lots of nonverbal help to understand those who speak it;
- then, as soon as there are the children who get to know this language at the age of language acquisition (below 6 years), they will create the "right" grammar and make the language suitable for communication; this, according to Pinker, is how the Creole languages were created;
- the same process took place in other situations (for example, Pinker describes the development of the languages deaf people use in the US and in Latin America).

What is the point? Well, for me, one very important point is that if you don't get a formal education in the language you speak, you end up creating a dialect which will mirror the way _you_ understood how the grammar works. Examples: the Creole languages; the dialects of ethnic or social groups; maybe even the Romance languages (which have all, more or less, stemmed from Latin a long time ago) initially could fall into this category.

The explanation is also quite simple: a language (a real language) is the fruit of the labor of more people than any dialect. This gives the mainstream languages their finesse and beauty. A small group is just not capable of doing the same - it does not have the resources for that. A dialect can be nice but that's all you can say about it....

Which brings the following conclusion: if even in the content-rich real life environment without any additional education you might never really master the grammar of the language you learn to speak, in the extremely meager learning environment you won't get perfect grammar, either. Which is fine with me because I would not be using Rosetta Stone as a single point of reference for language learning (can't help remembering the words of polyglot Ilya Frank: learning language is like assembling little threads into one huge ball; you should get these threads from all possible places to succeed). But they really shouldn't claim that their method is the only thing needed to master the language.

Nevertheless, as I have already noted, for memorizing basic vocabulary and grammar constructs Rosetta is great. But there are several things you have to take along: a good grammar reference (with exercises), plenty of books (starting from simple ones) and then, live speech examples (e.g. radio or podcasts). And all this won't save you from very clumsy way of communicating until you create your own little collection of set phrases and canned responses :)

Still, learning a language is great. It would be even greater if a system for language learning were an open-source one, with the possibility for everybody to add new modules, and with a dictionary which would also show the context usage of the requested words. But for now, it looks like a dream... (and it's high time to go and get some :) )

Language belongs to everybody, because it can't exist without everybody's input to it. Therefore, it can never be copyrighted. (Which is good.) But the works created in this language can and do get copyrighted. Where is the line that distinguishes between a part of the language and a creative work? Is an aphorism produced by a person copyrighted? Will it no longer be copyrighted if everybody starts using it on a regular basis, so that it just becomes part of the language? (Which is what actually happens.) How about longer pieces of work? If anybody now created a work which is a spinoff from a Shakespeare tragedy, would he or she have to pay money to Shakespeare's heirs? (Apparently not.) And if it were a spinoff from a Harry Potter book? (You know the answer.) It's only an arbitrary law which distinguishes these two situations from one another, which says that you have to wait a given number of years after the author dies to be able to freely play with his/her ideas; doesn't it actually hinder the creative process?..

Back to the topic: I wonder whether one day the right to learn a language will be considered one of the basic human rights. I also wonder whether the same would apply to translations, and if so, what will happen to all those companies which are producing language courses, dictionaries and the like. Will they be funded with public money? (If there is still money at that point in time...)

Monday, March 16, 2009

I wonder how the knowledge of the outside world is represented in the current AI models?..

Just had a thought that we, internally, don't represent separately the "data" and the "actions". Rather, we see the world as "objects", where data and possible actions are combined.

For example, an apple:

- is a member of a class "fruit", and as such, it can be: ripe, unripe, good, rotten; we can eat it, peel it (if it has a property "detachable skin"); cook it (if a property "cookable" is on, according to our representation of this class), slice it;
- it is also a member of another (sub)class "small round object"; as such, we can throw it, catch it, pick it up, lay it down, etc;
- it is a member of a class "objects with a value", so it can be sold and bought;
- ...
- last but not least, as a member of a class "real object" it can be seen and touched, etc...
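The apple list above can be sketched in code, more as a toy than as a claim about how any real AI model stores knowledge. The class names (`RealObject`, `Fruit`, `SmallRoundObject`, `ObjectWithValue`, `Apple`) and the `actions()` method are my own illustrative inventions; the only idea taken from the list is that each class carries both its data and the actions that data makes possible.

```python
class RealObject:
    """Anything physical: it can be seen and touched."""

    def actions(self):
        return {"see", "touch"}


class Fruit(RealObject):
    """Properties (data) sit right next to the actions they enable."""

    def __init__(self, ripe=True, detachable_skin=False, cookable=False):
        self.ripe = ripe
        self.detachable_skin = detachable_skin
        self.cookable = cookable

    def actions(self):
        acts = super().actions() | {"eat", "slice"}
        if self.detachable_skin:  # peeling only makes sense with such a skin
            acts.add("peel")
        if self.cookable:
            acts.add("cook")
        return acts


class SmallRoundObject(RealObject):
    def actions(self):
        return super().actions() | {"throw", "catch", "pick up", "lay down"}


class ObjectWithValue(RealObject):
    def actions(self):
        return super().actions() | {"sell", "buy"}


class Apple(Fruit, SmallRoundObject, ObjectWithValue):
    """An apple belongs to several classes at once and inherits from all of them."""

    def __init__(self, ripe=True):
        super().__init__(ripe=ripe, detachable_skin=True, cookable=True)


if __name__ == "__main__":
    print(sorted(Apple().actions()))
```

Each `actions()` call passes control along the inheritance chain, so an `Apple` ends up with the union of everything its classes allow. The cultural "second part" (set phrases, myths) has no obvious place in such a hierarchy, which is rather the point of the post.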

This is pretty much the same in every language, and it will not pose a big problem in translation into most of the languages that originated on Earth (at least, into those whose speakers have seen apples on a routine basis).

Now, there is a second part:
- "Apple of discord" could be understood only if you know some Greek mythology;
- "Adam's apple" requires yet another piece of cultural knowledge
- "An apple a day keeps the doctor away" is a set phrase in the English language, and its meaning would certainly be understood by any other reasonable human being, yet (for example) no Russian speaker (if he doesn't know English) would say something like that - the Russian language doesn't have this knowledge incorporated as a set phrase! (The closest one which I could remember has a slightly different meaning, but still represents the idea of escaping the guys and girls in the white coats in the following way: "If you want to stay healthy, stay away from them doctors!" No apples or other means are mentioned though)

There are plenty of other connotations for an apple, involving various fairy tales, stories, movies and pictures. Some of them belong to one language, others require some cultural or social background to be understood. This is the most difficult part when it comes to conveying the meaning, in my opinion.

Apart from that, "apple" in English rhymes with "grapple" (I'd also try to use "ample") and it's a 2-syllable word, just like the Dutch "appel"; in Russian it becomes "yabloko" (same origin, 3 syllables). Not sure about Dutch, but in Russian it does not easily rhyme with anything. This can pose a problem, but problems of this sort might, technically, be attacked by brute force, provided the translation engine has enough knowledge of both target and source languages and is instructed to look for words with a similar sound pattern (but there is no guarantee that such a solution exists).

When I am thinking about the second part, I feel a sort of despair. How could we correctly get the meaning of more complex utterances than mere descriptions of an outside world across the language barrier?

It seems to me that in order to understand an idea in more or less the same context as was intended by the person who expressed it, the recipient has to be made aware of the context (to the necessary extent). To achieve this, the translators of old times (and good translators of new times) have been adding dozens of comments and explanations to the books, wherever they thought them necessary for the average consumer of their work. It will still require some conscious effort on the recipient's part, of course.

Another approach was to translate the context so that the recipient will get "the same idea" but in the context to which he/she is accustomed. This has been mostly done in children's literature. The most prominent examples in Russian are completely "localized" versions of Pinocchio (known as "The Adventures of Buratino", where there is a capricious blue-haired doll instead of the Blue Fairy and the little wooden boy doesn't want to become a real boy at all!) and "The Wizard of Oz" ("The Wizard of Emerald City", which was so popular that the author of the translation/migration wrote five sequels to the story; to my knowledge, nobody in Russia is much interested in the original even now). More recent attempts at this approach showed up in various countries in the translations of the Harry Potter series (translating them to the letter with myriads of comments would have definitely spoiled the fun for the little readers).

The summary is that up to now there is no predefined procedure to decide which approach to take in every particular case, and no measure of success for any particular translation. After all, sharing the same native language is not enough to correctly understand the other person; social and cultural context is equally important. What do we want to share: the ideas which we came across, the emotions which we feel, something else?.. Will we ever be able to share both the ideas and the emotions in an adequate way, so that the language barriers become transparent? That would be a good use of AI for me...

Tuesday, February 17, 2009

Among all social applications, the one "social" feature I miss is "social dictionaries".

I hope they will be coming sooner or later. The way I see it, one should be able to see for every word how often it is used, not only on average, but also within particular social groups (for example: by age, education level, the places / countries where the speaker has lived, profession... ) A lot of statistical data which would have been impossible to collect before and which, in this time of overall obsession with statistics, seems to be more or less feasible to collect. This could help a lot in learning the correct usage of words in a foreign language. For example, if I know that there are two words with roughly the same meaning, but one of them is mostly used by teenagers and the other one often surfaces in official documents, then I have less chance of making a mistake with these two words.

Of course, this is a very rough example... the definition of the social groups is per se a separate topic. I could imagine having a number of parameters and a possibility to combine them for a definition of a group.
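To make the idea a bit more tangible, here is a minimal sketch of such a lookup. Everything in it is hypothetical: the `SAMPLES` data is made up, the attribute names (`age`, `country`) are just two of the parameters one could imagine, and `word_frequency` is a name I invented. A real system would also need proper tokenization and far more data than four sentences.

```python
from collections import Counter

# Hypothetical corpus: (attributes of the writer, text written by that person).
SAMPLES = [
    ({"age": "teen", "country": "UK"}, "that film was sick"),
    ({"age": "teen", "country": "US"}, "that movie was awesome"),
    ({"age": "adult", "country": "UK"}, "the film was excellent"),
    ({"age": "adult", "country": "UK"}, "an excellent report"),
]


def word_frequency(samples, word, **group):
    """Share of `word` among all words written by people matching `group`.

    `group` combines any number of parameters, e.g. age="teen", country="UK";
    with no parameters the whole corpus is used.
    """
    counter = Counter()
    for attrs, text in samples:
        # A sample belongs to the group only if every requested parameter matches.
        if all(attrs.get(key) == value for key, value in group.items()):
            counter.update(text.lower().split())
    total = sum(counter.values())
    return counter[word.lower()] / total if total else 0.0


if __name__ == "__main__":
    for age in ("teen", "adult"):
        print(age, word_frequency(SAMPLES, "excellent", age=age))
```

Passing several keyword arguments at once is the "combine the parameters" part: `word_frequency(SAMPLES, "sick", age="teen", country="UK")` looks only at British teenagers.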

The big question is: how to collect this information? OK, more and more people are now present on the Internet. How do we know to which group a person belongs? We don't want to have tags on ourselves, do we?..

On the other hand, there are already a lot of social networks and their users do provide some information about themselves by themselves, while also providing the content for the public domain. Could it be possible to use this content somehow for such a "social dictionary" system?

Of course, this leads directly to the questions of privacy. I wonder if there are already some thoughts about the evolution of the concept of privacy in the digital world. There is some anxiety about whether we will have much left to ourselves in the end. People tend to have secrets and people are submitting more and more data into the network. There are some mechanisms in development aiming at privacy protection though. That means there will always be some part of the content hidden from the public domain. One might wonder how much difference it can really make for the sort of "social tagging" proposed above.

In the meantime, I miss the "social dictionaries", because I am not content with having a row of synonyms and no clear clue how to distinguish one from another.

An even more precise approach would be to describe each meaningful word or phrase using some formal language (as each word or phrase is actually describing some point in the space of ideas which we all share). Then we could see how closely two words from two different languages match. Maybe the same space is covered by one language (or dialect / argot) more densely than by the other one, and these subtleties are chipped off when the text is translated. (There is a well-known example of many varieties for the word "snow" in some Northern languages, but it's already an exotic variety, there could be more interesting ones, for example describing people's feelings or actions in this or another area).

If we could find a way of mapping any text into this multidimensional space of lexical invariants, then many things could be possible. Comparison of the translations of well-known texts (books etc) into different languages. Comparison of different texts and finding similarities between them. Finding the associations, too (we can see with which words/phrases this particular word/phrase is often combined, and we can see if some words / phrases sound alike and therefore could be associated one with another... finding the words which sound alike is already feasible with the Soundex algorithm). Finding out if the texts of a particular author share some specific traits (specific vector families in this lexical multispace).
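For the "sound alike" part, here is a small sketch of the classic American Soundex code. The function name `soundex` and the `CODES` table are my own naming; the algorithm itself was designed for English surnames, so treating it as a general measure of how words sound (let alone across languages) is an assumption, not a given.

```python
# Consonant groups of classic American Soundex; vowels, y, h and w get no digit.
CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return the 4-character Soundex code of `word` (ASCII letters only)."""
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so the same code may repeat after it
    return (result + "000")[:4]  # pad with zeros, keep letter + three digits


if __name__ == "__main__":
    for w in ("apple", "appel", "Robert", "Rupert"):
        print(w, soundex(w))
```

With this sketch, English "apple" and Dutch "appel" both come out as `A140`, and "Robert" and "Rupert" both as `R163`. Russian "yabloko" would need transliteration first and lands on a different code, which shows how crude the tool is for cross-language use.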

Then of course, any text is just a more or less successful attempt to express the thought. Words cannot be trusted as they are just labels for the reality which we chose to share between ourselves. But it is a fascinating area nevertheless.

Tuesday, January 27, 2009

...

A long time ago, people used to believe that the Earth was the center of the world, and that everything else revolved around our little planet.

To the naked eye, it seemed reasonable and didn't contradict the everyday / everynight experience.

Then the telescope was invented. It allowed people (at least, those who took the trouble to get one) to see not only the Sun, Venus and Mars, but also some more distant planets. Apart from that, people were becoming more and more skilled in math, and started trying to use their math skills to predict the way the planets move.

So began the trouble.

It just so happened that the planets moved in quite tricky ways. Instead of just rolling in giant circles across the sky, it seemed to the Earthlings that the planets were revolving in circles upon circles (it was the only way to bring at least some harmony into these unexpectedly erratic movements). For these weird paths, a special name was coined: epicycles. But why did the planets behave this way? Probably there existed some theories about that. For the reason explained just below, only people who are interested in the history of science, and of astronomy in particular, might still know a bit about them.

The reason is now well known: somebody happened to get another idea, namely that the planets were indeed revolving, but not around the Earth. The Sun appeared to be the real center. And if you accepted this idea then there was no need for these complex epicycles, because the planets were simply moving along nice circular orbits (which is way better and simpler to calculate). Later on, it turned out that these orbits were actually a bit elliptical, but this is another story.

There has been some struggle around this idea, and around subsequent ideas which shifted the center farther and farther away from our little habitat, but finally people accepted that the Earth is not the center, the Sun is not the center and even the center of our Galaxy is not the center of everything. Because it helped to simplify the representation of the world, and simplicity is the key to beauty, and beauty appeals to humankind way more than complexity. (We can go on and on from here).

The point of this is very simple, but has to be stressed now and then: if your world starts getting way too complicated, it might be time to shift your point of view.

Monday, January 19, 2009

Open-source seems to be a good candidate for the next round of tech-related hype (just after the dotcoms and web2.0). Ideas which have been implemented more or less quietly by a bunch of enthusiasts (like Wikipedia, for example) are already cloned (Google's Knols) - open-source is a philosophy which allows one to grab the ideas of others and reimplement them, as long as you don't forget to mention the original authors (if they can still be traced)...

Several things which I can't help noticing:

1. I can't understand how the old patent system and the open-source approach to copyright are going to coexist. For example, Google is now actively promoting their new mobile phone platform, Android, and encourages young, enthusiastic and (often) not-well-paid-yet guys and girls to write new, exciting, and no doubt, open-source applications to promote Android even further (the best ones are getting the blue ribbons indeed). What about the patents for these ideas? Who is going to hold them in the end? These guys (I have asked one of these winners at the Android Developer's camp in Amsterdam) seem to never think about this.

2. What about the data? Who owns the data collected by Wikipedia? Who owns the data collected by e.g. TomTom (my previous employer) via MapShare technology? Who owns the data collected by OCLC (my current employer) via WorldCat? (This one is actually the flamebait right now, the other two have not been discussed, or I don't know about it). Who owns the data collected by Google via their system of blogs (like this one :) ), webmail, etc? Is it possible to talk about the data ownership in this case, or only about the ownership of the data maintenance structures and the right to use the data because of that?

2a. Don't forget, "data maintenance" should include data verification and systematizing, which is not work for novices and requires quite a strong discipline, like every necessary routine. It also implies guaranteed all-time access to the data (which means hardware issues), and these parts are not that easy to open-source. Distributed computing is one of the answers to that, but the trade-off is performance. If the response time is supposed to be critical, then you need to have some dedicated hardware and it has to be supported, therefore somebody is supposed to get paid for it.

3. Open-source, in itself, is not a silver bullet. If you compare an open-source solution and a closed-source one from the point of actual costs you spend to get what you want, the result might be quite surprising: fixed costs for closed-source solution (price of the software + maintenance costs, which is usually agreed upon beforehand) versus open costs (nothing + salary of the guy or guys who are going to support this piece of software for you).

4. Software is not good or bad because it is open- or closed-source. It is good when:
- it is clearly documented;
- it is well tested;
- it is well supported;
- it is written well.

All of this can go wrong for an open-source project as well as for a closed-source one, but in the case of an open-source project, if you happen to be the client and the project dies, you are left with nothing because nobody has been responsible. Documentation is not the thing programmers like to do; you won't find much documentation even for Mozilla (one of the longest-running and extremely popular open-source projects). The wider the scope of an open-source project, the more important becomes the question of organizing the "crowdforce"; it is more difficult than ordering around guys who are reporting to you because, after all, these enthusiastic guys and gals work for fun!.. If you spoil the fun, they'll go!

5. The term "crowdsourcing", to be honest, reminds me of the work of José Ortega y Gasset called, in English, "The Revolt of the Masses". The point of this work is that unorganized masses are actually a reactionary force. The crowd, if not organized, is less smart than any single member of it. What is needed for an open-source project to succeed is an organized crowd, so that it really becomes a whole where various parts play various roles for the common good. If the project leaders manage to achieve and keep this state, good for them. If they also know what they want to achieve with their project and can share their vision with their team and get their support in return, excellent. That is how it should be. But there is no guarantee that it will always be like that.

6. "The human factor" becomes extremely important. People have emotions. You never know what makes the other one explode until you are swooshed away by the blast. If the project becomes big enough (on the order of several dozens of active participants) the conflicts will be inevitable. The project leader has to be prepared to solve them, sometimes by force, and to have enough self-confidence, so that the others would not question every decision he makes. This is a tough part.

7. Finally, the disclaimer: I am quite enthusiastic about the open-source approach, but that is why I can't stop thinking about the ways how it can fail. It would be a real pity if open source, as a result of the coming hype, becomes an "anti-buzzword", like these dotcoms of the past century.

Update: Open-source has its charms, there is no doubt about it. It allows the developers to become experts in the areas they like, and to stay experts no matter what their official affiliation is. It makes the workforce more mobile, because your expertise with (the development of) open-source products is often easier to take along to the next assignment than the expertise with (the development of) closed-source products. Therefore, it makes the work for the developers look more like a hobby, which is good for morale. That is why it is so important to make sure that the open-source initiative, now that it has come into the open light, will stay healthy.

Thursday, January 15, 2009

The gist of Goedel's theorem seems to be found in the fact that our language becomes a pitfall when it is used for self-reflection. Any formal system which we could have thought of (till now) has the same flaw.

No human-made formal system appears to be capable of self-reflection; yet any human can do it.

Why are our brains "made" in such a peculiar way? As if there is a blind spot which prevents us from seeing the source of light. Will we be able to develop a workaround for this spot? If the answer is yes, what sort of "language" will be used for such a purpose (i.e. a formal system for which Goedel's proof does not hold)? Will it be a "language" at all? What is, after all, the definition of the concept of "language"?

Tuesday, January 06, 2009

Continuing the translation-related stream of thoughts... What is, ultimately, the purpose of translation? According to Hofstadter, a thought could be seen as a specific itinerary through the (sub)network of active symbols (with all traffic scrupulously recorded). Along with this, we have to realize:
- there is no other network like this one;
- the author of the thought (be it in the form of a novel, a piece of poetry or even a phrase) does not necessarily perceive in all detail what the purpose of all this was and what the implications might be;
- ultimately, if one takes this utterly seriously, one will find himself / herself in the situation of S. B. Odin from the Strugatskys' book 'Monday Begins on Saturday', who, as we remember, knew how to perform every possible miracle, but was not able to really do anything, because it was never possible to satisfy all conditions!

Therefore, we can never expect the translation to be ultimately perfect. It simply cannot be, because no two human beings possess the same symbol network. The interesting question therefore is, what is the threshold of the information loss. How much information can be preserved when transmitted from one human being to another? How much information should really be preserved? (What if the noise is part of any thought, like virtual particles cloud is part of any "real" particle - sure we don't have to translate the noise?..)

Monday, January 05, 2009

Reading Hofstadter's GEB, I've come to the point where he cites "Jabberwocky" in English, German and French and compares the beginnings of three translations of "Crime and Punishment" into English. This, and subsequent episodes, have provoked some response in my symbol network, which I am going to try to pin down.

First, regarding Hofstadter's remark on the French translation of the word "slithy" ("lubricilleux"): he wonders if using a Latin-based word where its analogue in English is not Latin-based would trigger in the French reader an additional sense of "alienness" which was not intended in the original. I am not the ultimate expert on this subject but I have the impression that the French perceive words with Latin roots simply as their own (French, after all, is a Romance language while English is not). Therefore, the translator's choice was probably quite a fitting one. It is the fact that French has so many words in common with English (first, because England was once a Roman province, and second, because the English were later conquered by the French-speaking Normans) which obscures the difference between these two languages, but I wonder why Hofstadter did not mention it (could it be that he did not meet many French people at that time? ;) )

Second, regarding the translation of Dostoevsky. In the first phrase, there is the name of a street (S.Pereulok, abbreviated from Stolyarny pereulok). One of the translators just left S.Pereulok as it was, the other one changed it to S.Lane, and the third one called the little street "Carpenter's Street". It is interesting that Hofstadter is against translation No.3, because to him, it makes the whole thing sound like one of Dickens' works which could take place in London, and in this case, why read Dostoevsky, asks Hofstadter, instead of reading Dickens, who is the ultimate translator of the same ideas into English?..

This is an interesting point. Could it be that different persons would require different flavors of translation for their minds to accept them as most suitable? The one example which immediately came to my mind is the Russian translation of "The Wizard of Oz", which, as almost any Russian speaker knows, is called "The Wizard of Emerald City", has five follow-ups which wander far, far away from the original story (which had already been altered "to get accepted by the Russian children better"), and which is, as far as I know, far, far more popular than the more accurate translations of the original.

On the other hand, the most popular Russian translation of Alice in Wonderland is not the one by Nabokov, who tried to do a similar trick, substituting English realities with Russian ones, but either the one by Boris Zakhoder, who preserves the "Englishness" of the book but does make all the word plays and puzzling paradoxes comprehensible, so to say, or the one by Nina Demurova (who is assumed to be, in any case, the best translator of the little "Jabberwocky" piece)...

Thinking about the connotations and associations which exist in one language and do not exist in another is in itself both puzzling and endlessly interesting, but there are lots of traps which are so easy to fall into. For example, the simple English phrase "one another", at first glance, seems to be literally translated into Russian as something like "friend-a-friend" (друг друга). But then, the word "друг" here can be an abbreviation of the word "другой", which literally means "other, another". But then again, isn't the word "другой" related to the word "дорогой" (dear), by the rule of making longer Russian words from the shorter Ancient Russian / Church Slavic words? Anyway, the words "friend", "another" and "dear" seem to be related in Russian, if only by the power of alliteration and pseudo-etymology. What a nice thought can be drawn from this fact - that in the language itself, it is encoded that all the other persons are, by default, our friends, and they are, by default, very dear to us! (Incidentally, the word for "enemy" (враг, ворог) seems to be close to the word for "doing (evil) magic" (ворожить) - which can, if one is in the proper mood, be explained as meaning that our only enemies are those who are doing magic. It actually fits, if one takes the definition of magic as the attempt to overturn the laws of nature by the force of individual will - such persons could indeed be potentially dangerous!)

Enough for today.

Sunday, January 04, 2009

At long last I have finally learned that "лопух" (burdock) is the same thing as "репейник" (burr). To me they had always been two different plants!

Wikipedia rules :)

Friday, January 02, 2009

A text translated from another language (unless the translator is a Translator with a capital T) differs from a text created in that same language the way a death mask differs from a work which a sculptor (even a sloppy, half-trained one) has made from a living model.

That is why I can't take in many modern English-language films and books in Russian translation. Good translators haven't gone anywhere, of course. It's just that there are far too many bad ones now.

By the way, I happened to come across a decent (English-language) post about translation and a translator. It even mentions the corresponding book by Hofstadter, which, by the way, the author himself considers much more important than GEB.

When we finally figure out a technology for more or less adequate translation, the world will change a great deal. It may even happen within the next 10 years or so, unless, of course, humanity throws its energy into yet another folly like a world war. On the other hand, we owe the emergence of computers to the previous world war... But it would still be better without cataclysms. Otherwise there will be nobody left to talk to.