English & The Digital Gap
Jeff Atwood at “Coding Horror” writes that it is reasonable to require programmers to speak English; most of his commenters agree.
Jeff writes that:
Consciously choosing to switch from Polish to English reminds me why I gave up Visual Basic for C#, as painful as that was. These languages do exactly the same things — and the friction of choosing the minority language was severe.
So here there is a little of that ugly American, someone who never needed to learn another language. People may be keen on their first programming language, but it is nowhere near comparable to a native (human) language. Nowhere. Learning a foreign language to the point where you can read technical writing takes years. Writing in a foreign language will always cripple you, unless you’re super-talented, or you got to live in an English-speaking environment at a young age.
Learning a new language is not like learning a new programming language. If anything, it is comparable to mastering how to program.
And so English becomes a – or maybe the – major obstacle in closing the Digital Gap.
A couple of years ago I taught at Tech-Careers. It’s an IT training center for Ethiopian-Israelis, a group that experiences immense hardship in integrating into Israeli society. While my students were all high-school graduates, they had only rudimentary English. Basic material was available for them in Hebrew (you can find HTML, JavaScript, C# or ASP books in Hebrew), and we encouraged them to use Hebrew programming Q&A forums. But only so much is available. It was frustrating to see them struggle with English online material, and it was discouraging to imagine how English would ultimately hamper their employment opportunities.
We offered them help in improving their language skills, but in an already overloaded program, we could hardly make it a priority.
Yes, it is certainly very difficult, almost impossible, to work in software in Israel without a working knowledge of written English. It’s not just the learning material that’s missing. Almost all high-tech companies are geared towards export, so unless you work in-house for a traditional company (I am thinking of a bank), you will have to write documents and very possibly interact with people in English.
But do realize that it comes with a price.
March 31, 2009
Podcasts with Python
Back in the first days of computing, I am told, it was thought that people would write their own software to answer their specific needs. It’s easy to ridicule this optimistic notion in a world where desktop applications take zillions of developer hours. Still, it turns out that sometimes what’s available off-the-shelf just doesn’t meet the need, and if you are a programmer, you are tempted to attempt your own tailor-made, if quick-and-dirty, solution.
And that’s exactly how I ended up writing my own Python script for downloading podcasts.
I am not sure how other people manage their podcast subscriptions. I never owned an iPod and never used iTunes, so I can’t rule out that the bold and the brave of Cupertino came up with something smart. I did try Juice and Ziepod and Doppler and some others. None was quite right.
So I wrote my own Python script for that. It’s here. I can’t imagine it would be very useful for other people out-of-the-box, though you may try. Hopefully it’s documented well enough that you can scavenge the code for other purposes. Let me know if you find it useful and/or if you have comments.
The script uses two hugely useful Python modules: EasyGui and Universal Feed Parser. I’ll try to put together some lines about both in the future.
And since we’re talking about podcasts: the meager Hebrew tech podcast scene recently received a reinforcement with the new “Reverse im Platforma” (Reverse with a Platform?) podcast by Ran Tavori and Uri Lahav. Naturally, it would be hard for them to make all content equally interesting for the diverse software development community, but I really recommend giving it a go.
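To give a feel for the core of such a script, here is a minimal sketch of the "find the audio files in a feed" step. It is not the script mentioned above: it swaps Universal Feed Parser for the standard library's xml.etree.ElementTree so that it runs with no extra installs, and the feed content, URLs and function names are all made up for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# A made-up RSS 2.0 feed; a real script would download this text first.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.com/audio/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Show notes only</title>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.com/audio/ep2.mp3" type="audio/mpeg" length="456"/>
    </item>
  </channel>
</rss>"""


def list_enclosures(feed_xml):
    """Return (title, url) pairs for every item that carries an enclosure."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # an item without an audio attachment
        title = item.findtext("title", default="")
        episodes.append((title, enclosure.get("url")))
    return episodes


def filename_for(url):
    """Derive a local file name from the last segment of the URL path."""
    return posixpath.basename(urlparse(url).path)


if __name__ == "__main__":
    for title, url in list_enclosures(SAMPLE_FEED):
        print(title, "->", filename_for(url))
```

The actual download (and remembering which episodes were already fetched) is left out; that is where most of the personal tailoring tends to go.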
February 19, 2009
Moving my SVN repository to xp-dev
For the last two years or so (I think), I have been keeping my code version-controlled at assembla.com.
My code – that is, the code I write at home, most significantly my school-related code, but also more important stuff such as my never-ending Taki clone project.
Now please don’t think version-controlling my personal projects means I am organized. Version control is a cool thing for the coolly non-organized people who make a mess of their code often enough to swear by the revert command. Like me! And of course remote version control also means backup against hard disk issues.
So I chose SVN because it’s free and open-source and I know it from work; well, I know it from work because I installed it there after they had been using SourceSafe for years. And I chose web-based because I don’t feel like maintaining a local server installation, plus there’s the part where you can access it from anywhere and the remote backup thing. And I chose assembla because it looked good at the time.
Except I have just received an email saying that they are going to cancel their free plan (apparently they announced that some time ago, but somehow I missed the message).
So I quickly Googled an alternative solution and found xp-dev, which seems to do everything I want. Signup and migration took about 10 minutes. There are short instructions here, except (by now?) you can import the repository dump through the web interface, without emailing support.
So far so good, but it’s only been a few hours…
(Edit: I came across this post that linked to this interview with the guy behind xp-dev. You might be interested in reading it. After all, you’re going to trust your dear code with this guy).
(Edit: apparently the spell checker of my Windows Live Writer was off when I originally wrote this post. I hope with the help of Firefox’s speller things are better now).
January 2, 2009
Wordcamp IL 2008: Your Next Web Site Will Be a Blog
Ultimately, the reason I went to Wordcamp was Hanan Cohen’s talk “Your Next Web Site Will Be a Blog”. It was insightful, which is not surprising as Hanan strikes me as one of the people who know the most about the place where NGOs meet the web.
The message was that blogging platforms (specifically, WordPress) can be used to build a pretty web site quickly and cheaply by non-techies (anyone who can open a Gmail account and send an email with an attachment, to use Hanan’s words).
I have a little experience with NGOs’ web presence. It’s complicated, expensive and frustrating for everyone involved. Starting a blog on a web site like Blogli or WordPress.com is cheap (free, in fact), but more importantly, you work gradually. You start with one page (that is, one post) and see how things evolve. You don’t have to make complex decisions or come up with a lot of content up-front.
A blog can actually predate the organization. Three people with an idea and no money can start a blog as one of the first steps to create an organization.
Or even if the NGO already has a web site, a blog may be used for one event or project.
I think the core of the message was that the NGO people can and should do that themselves. This way they can control, update, fix their blog themselves, or even get rid of it if it doesn’t work. Crucially, this way they learn to understand the medium better, to the profit of their organizations and to their personal profit. They will come better equipped to their next “full” web site construction project.
I am pretty convinced myself. Maybe less so for an organization with an existing site, unless it’s totally crappy: there’s value in keeping all the organization’s information in one place, rather than confusing people (and Google) with multiple addresses. But as a first step towards web presence, certainly.
November 16, 2008
Why Google’s English Translation is Better
(This is an English version of my original Hebrew post that appeared here).
Google Translate was launched for 11 additional languages in September, including Hebrew. Playing with it a little, you would soon notice that translating from Hebrew to English yields much (much, much) better results than attempting the reverse direction. If you read some Hebrew, there’s a pretty typical example here, but you can produce any number of examples simply by trying to translate virtually anything.
What’s going on then? Why is the translation into English legible, even usable, while the translation into Hebrew amounts to complete and total rubbish? (Especially since, intuitively, the Hebrew writing system is problematic and ambiguous, so I would expect translating from Hebrew to be the harder direction.) So here’s what I think the answer is.
Google’s translator “learns” to translate using two kinds of sources. The first is a pool of translated texts, that is, texts that were written in one language and translated into the other. The other type is a pool of texts in the target language.
The translations pool is used like a bilingual dictionary, only better. Just as you would look up an entry in the dictionary in order to translate it, you can search for the word in the pool of translations and see how it was translated before. Plus you have the advantage of being able to use the context to choose the best translation for the current case.
For example, in Hebrew one word is usually used for “search”, “look for” and “seek” (לחפש – lehapes). If you have to translate “lehapes” from Hebrew to English using a dictionary, you would find all these options. But if you search for “lehapes be-Google” in the translations pool, you would find “search” to be used in this context.
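The lookup-with-context idea can be sketched in a few lines of Python. This is only a toy illustration of the principle, not a description of how Google's system works: the pool below is hand-made, the transliterated phrases and their counts are invented, and the function name is hypothetical.

```python
from collections import Counter

# A tiny hand-made "translations pool": (source phrase, earlier translation).
# Every entry here is invented for the sake of the example.
TRANSLATION_POOL = [
    ("lehapes be-Google", "search on Google"),
    ("lehapes be-Google", "search on Google"),
    ("lehapes avoda", "look for a job"),
    ("lehapes avoda", "look for a job"),
    ("lehapes avoda", "seek employment"),
]

# What a plain bilingual dictionary would offer for "lehapes".
DICTIONARY_OPTIONS = ["search", "look for", "seek"]


def translate_in_context(word, context_word, pool=TRANSLATION_POOL,
                         options=DICTIONARY_OPTIONS):
    """Pick the option most often used when `word` appeared next to `context_word`."""
    counts = Counter()
    for source, target in pool:
        source_words = source.split()
        if word in source_words and context_word in source_words:
            for option in options:
                if target.startswith(option):
                    counts[option] += 1
    if not counts:
        return None  # this context was never seen in the pool
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(translate_in_context("lehapes", "be-Google"))
    print(translate_in_context("lehapes", "avoda"))
```

With this pool, the Google context picks "search" and the job context picks "look for", while a context that never appears in the pool gives no answer at all, which is exactly why a small pool hurts.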
The target language pool has two functions: one is to help in selecting the correct translation, and the other is to assist in constructing a reasonable target language sentence: changing word order, matching the gender and number of the subject and the verb, etc. Basically, the idea is to translate the source text in any conceivable way, and test which translation best matches what we see in the target language pool. For example, if we come up with “generality of elections” and “general elections”, we can assume that in most cases the second would be better.
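Here is a minimal sketch of the "test the candidates against the target language pool" step, again as a toy and not as Google's actual method. It assumes a very crude scoring rule (count how often each pair of adjacent words from the candidate appears in the pool); the three pool sentences and all function names are made up.

```python
from collections import Counter

# A tiny invented pool of target-language (English) text.
TARGET_POOL = [
    "the general elections will be held in february",
    "general elections were called early this year",
    "the results of the elections were published",
]


def bigrams(sentence):
    """Return the list of adjacent word pairs in a sentence."""
    words = sentence.lower().split()
    return list(zip(words, words[1:]))


def build_bigram_counts(pool):
    """Count every adjacent word pair seen in the pool."""
    counts = Counter()
    for sentence in pool:
        counts.update(bigrams(sentence))
    return counts


def score(candidate, counts):
    """Crude fluency score: how often the candidate's word pairs occur in the pool."""
    return sum(counts[pair] for pair in bigrams(candidate))


def pick_best(candidates, counts):
    """Choose the candidate translation that best matches the pool."""
    return max(candidates, key=lambda candidate: score(candidate, counts))


if __name__ == "__main__":
    counts = build_bigram_counts(TARGET_POOL)
    print(pick_best(["generality of elections", "general elections"], counts))
```

A raw sum like this favours longer candidates and knows nothing about grammar, so a real system would need something far more careful; the point is only that the bigger and better the pool, the more reliable this kind of test becomes.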
Translations are hard to find, and translation pools are hard to build. The number of texts that were translated between any two languages is much much smaller than the number of texts written in any of these languages. We prefer avoiding archaic language, and we prefer texts similar to the ones the computer will later have to translate. So in software designed to translate web sites, we don’t really want to use Bible translations, or even Harry Potter. In short, this is a challenge, even for an information giant like Google.
It turns out, though, that if you have a really good pool of target language texts, it can compensate for a small translations pool. I read somewhere about the following experiment: people were asked to evaluate some (human-made, in that case) translations. There were two groups of evaluators: bilinguals, who evaluated the translations having also read the source; and monolinguals, who were asked to evaluate the translation quality based on reading the translation only. The monolinguals’ evaluations, it turned out, were very similar to those of the bilinguals. (My scientific conscience troubles me about not giving a citation for this, but not enough for me to start rummaging through papers and files. Do ask though, if you need it).
What we learn here is that a lot of a translation’s quality has to do with how well the product makes sense in the target language. A good target language pool and a good way of using it to test our translations may improve translation quality dramatically.
And that’s, I guess, where the difference between translation into Hebrew and into English lies. Google’s Hebrew text pool would be way smaller than the English one. Moreover, I think it’s safe to assume that the way they use it to verify translation correctness isn’t quite as sophisticated. In other words, in the limited sense that software “knows” a language, Google’s “knows” way less Hebrew than it does English. Which is not really surprising, after all.
October 30, 2008
Debugging Memory Issues in Java
Since I started this blog with 3 CMS-related posts, you might well have missed the fact that professionally, I am a Java developer.
So to remind you of that, I want to share something I found out about some time ago: a method for recording and analyzing the memory-related activity of a Java program, without a profiler. It requires jhat, delivered with Sun’s JVM, but on Windows only since Java 6.
October 27, 2008
Concerns about Drupal Release cycle
In previous posts I preached Drupal, and I feel I should add some concerns; first and foremost is Drupal’s release cycle.
Drupal major versions are not compatible with each other. They “break the API”. All the contrib modules need a rewrite (contrib modules are 3rd party plugins, and you use them extensively). Themes (templates, designs…) need a rewrite.
Major releases come, in theory, every year or so. So you have to upgrade your site every year. For me this is a major drawback. I haven’t been through this yet, but I will probably hate it.
There are two major versions supported at any given moment (right now it’s 5 and 6). But you can’t really buy time by skipping major releases (e.g. moving from 4.7 directly to 6). Well, maybe you can, but it’s problematic. One reason is that the official upgrade process is only from the previous version, 5 in this case. More importantly though: you can’t really use Drupal when it’s first released, so there aren’t really two functional versions supported at any time.
Drupal 6 (D6) was released in February 2008. On that day, the compatible Views module had only an alpha version, and today, we only have a release candidate for Views. I wrote about Views in a previous post: it may be a contrib module officially, but I don’t think there are many Drupal installations that don’t use it. And with more peripheral stuff, it may take months before you have a first compatible release. The celebrated Zen theme, for example, only released an initial D6 version in May.
I use D6 for my site, with many modules still in RC, beta or even alpha stages. It’s probably not best practice, and I wouldn’t do it for an enterprise site. Even so, I won’t be able to skip D7: the way I understand it, the day D8 is released, my D6 site won’t be supported anymore, but it will be many more months until I can actually replace it with a D8 site.
Now in comparison, Joomla still supports 1.0.x, where 1.0.0 was released in 2005. It’s probably well worth upgrading to 1.5 by now. Yet however painful the upgrade is (I have no idea), once in 3 years is not unreasonable.
Drupal developers defend the release cycle fiercely (for example here), and I am not going to argue with people that do good work for me without requiring me to pay. However I think it’s a point well worth considering before you commit yourself to Drupal.
October 24, 2008
Drupal flexibility in four keywords
If you are into choosing a CMS, you may have heard (for example in my previous post) that Drupal is flexible. And maybe you think, gee, that sounds cool, except I am not really sure what it means. So now I am going to attempt to write about four major flexibility aspects of Drupal, in a language you should understand even before you have started the infamous Drupal learning curve.
A disclaimer. As I stated before, I moved to Drupal from Joomla, so it’s against Joomla capabilities, specifically Joomla 1.0.x capabilities, that I compare.
Starting with nodes, then. Nodes are Drupal content items. What’s great about them is that they can contain anything: including full HTML or PHP. I have an event calendar that uses a Google Calendar iframe. It’s just a normal content item that contains the iframe code I copied & pasted from Google. In Joomla, I had to install a component to do that.
Next keyword: blocks. Blocks are boxes that you configure using Drupal’s web interface. They can contain any type of content (again, HTML, PHP, and see also later about Views), and they can be placed in a bunch of different page locations, on all the pages or some of the pages, for all users or some users. So here again you use the web interface to create stuff that would require a module in Joomla. Back to the calendar example, I have got a list of upcoming events on my homepage. It’s PHP code, adapted from James Cridland, and put into a block that appears only on the front page.
Third, CCK, or Content Construction Kit for long. It’s a contrib module, which means a 3rd party plugin, but I don’t think there are many Drupal installations without CCK. With CCK, you can create new node types, or content types, and/or attach new fields to existing content types.
Here’s a fairly basic example. The content on my site comes from many authors, but there’s a single person responsible for putting it online (and that’s me). So while the actual author is Rabbi Stacey Blank, the author field can only contain something like webmaster. In Joomla, there’s a built-in solution for that: an author alias field. In Drupal? No problem, I just added the field to the relevant node types using CCK.
Less trivial example? Creating an e-shop without a specialized module, by defining a node type for products. With an e-shop module, your products will have the properties the module author came up with (name, price, maybe size, maybe color). With CCK, your products will have exactly the properties you need (nutritional value? scent? you name it). Or maybe you need something that just doesn’t have a module, a pets site, a recipes site, a knitting patterns site. With CCK, you can make them without writing code.
Last and certainly not least: Views. Views is yet another omnipresent contrib module. It allows you to query the content items database, using a nice UI rather than SQL, and then put the results in pages, or blocks, or feeds, or wherever. You can use it to create a “last updates” box, or a blog-style monthly archive, or a random image box. Again, it saves you the need of finding and learning specialized modules (not to mention finding out their limitations and learning to hate them). More importantly, it allows you to come up with original uses to suit your own needs.
As a wrap-up, what flexibility means to me is that you can create something new, something that the developers of the CMS didn’t anticipate, without having to write code yourself. And I think Drupal does a very good job of that.
October 21, 2008
Why I moved from Joomla to Drupal
“Joomla or Drupal” seems to be a question the web asks itself quite often, these being probably the most popular open-source content management systems. So I thought I might add my perspective.
I am not a professional web developer, but in my spare time, and for the sake of the greater good, I have built and am maintaining a congregation web site.
I first moved to a CMS-based site more than two years ago. Back then, I decided on Joomla mainly because people said it was much easier to master. I still think it’s true, but it comes at the price of being much less flexible (which is a trade-off you could expect).
Just an example. Joomla has an extremely annoying categorization system. There are sections, which contain categories, which contain content. Every item must be in a category and every category must be in a section. I can’t suppose the people who designed Joomla were morons, but I do think it’s pretty weird that they thought all the content people would ever want to put into Joomla would fit into this scheme. This is not a very important limitation by itself. I worked with Joomla, I coerced myself and my data into this system, it wasn’t pretty but it worked. I am just saying, people who design like that cannot be expected to build a flexible system in general.
Drupal’s design is much more open-ended, which means things you could not do with Joomla, or could not reasonably do with Joomla, can be done with Drupal. Also, things that would require you to tinker with your Joomla template or write a new component or module are done using only the web interface in Drupal. I am going to write more about this in the future.
The other selling point, for me, was the strong support for multilingual sites in Drupal. With Joomla I had two installations to support bilingualism, which means two logins, twice the maintenance, and media files duplication. Building a multilingual website isn’t effortless in Drupal either (maybe I’ll blog a little about that too), but it’s possible and it’s worth it.
I have worked with Joomla for a long time and only migrated to Drupal recently, so I do fear some agonies are still to come. In fact I already have some rants that I will probably share with you at some point. But for the time being I am very happy with the move.
October 17, 2008
