Showing posts with label google. Show all posts

13 Jun 2012

Rails i18n translations in Yaml: translation tool support

With Rails 2.2 the i18n API was introduced with a new method for translations.  Instead of embracing the venerable gettext which had been the previous standard, the Rails team invented a new way using Yaml files.  The result is a particularly graceful, flexible and very Rubylike way of specifying translations.  It also is much more reliable than gettext, which had many inscrutable issues with locales and caching, and sometimes caused people to get things in the wrong language.  So: bravo, great job.

But to do this, they specified their own translation format, the very flexible Yaml file. There are already a lot of formats floating around, and translation tool vendors and open-source translation developers have been working for a long time on conversion tools between them.  The Translate Toolkit and Pootle emerged from South Africa (a country which groans beneath the weight of, or rather revels in the glory of, eleven official languages); together they provide an excellent web-based tool for collaboration, centered around gettext PO files.  However, poor little Pootle started a migration to Django, and we all know how rewrites go.  [Halfway. Badly.]  But Translate Toolkit supported a lot of formats:

  • moz2po - Mozilla .properties and .dtd converter. Works with Firefox and Thunderbird
  • oo2po - OpenOffice.org SDF converter (See also oo2xliff).
  • odf2xliff - Convert OpenDocument (ODF) documents to XLIFF and vice-versa.
  • prop2po - Java property file (.properties) converter
  • php2po - PHP localisable string arrays converter.
  • sub2po - Converter for various subtitle files
  • txt2po - Plain text to PO converter
  • po2wordfast - Wordfast Translation Memory converter
  • po2tmx - TMX (Translation Memory Exchange) converter
  • pot2po - initialise PO Template files for translation
  • csv2po - Comma Separated Value (CSV) converter. Useful for doing translations using a spreadsheet.
  • csv2tbx - Create TBX (TermBase eXchange) files from Comma Separated Value (CSV) files
  • html2po - HTML converter
  • ical2po - iCalendar file converter
  • ini2po - Windows INI file converter
  • json2po - JSON file converter
  • web2py2po - web2py translation to PO converter
  • rc2po - Windows Resource .rc (C++ Resource Compiler) converter
  • symb2po - Symbian-style translation to PO converter
  • tiki2po - TikiWiki language.php converter
  • ts2po - Qt Linguist .ts converter
  • xliff2po - XLIFF (XML Localisation Interchange File Format) converter

On its heels, Google introduced the Google Translator Toolkit, which lets you use the Google Translate engine to suggest translations (based on its own databases or translation memories you provide).  It also does the core of what Pootle does (collaboration and access) but without the crashing and flakiness, and it works with a number of file formats.

But neither of them supports Yaml files.  Unfortunately, tooling support libraries have not embraced this format in the intervening two and a half years.  I did find one solution: i18n-translators-tools, which supports conversion between Yaml and gettext PO files, but it's somewhat idiosyncratic, and it turns out there's a good reason why there isn't a straightforward Yaml ↔ PO converter: the PO format consists of name-value pairs with metadata, and the Yaml format is a tree.

English source Yaml file:

page_info:
  sales/credit_notes:
    date: "Date"
    title:
      default: "Sales Credit Note"
      new: "New Sales Credit Note"

Spanish Yaml file produced by i18n-translators-tools from a PO file:

page_info:
  sales/credit_notes:
    date: "Fecha"
    title:
      default:
        default: "Sales Credit Note"
        translation: "Crédito de venta"
      new:
        default: "New Sales Credit Note"
        translation: "New Sales Credit Note"


There are some interesting things going on here: the Spanish Yaml file provides fallbacks so untranslated strings don't come through as blank.  The intermediate gettext PO file keeps the tree structure in the msgctxt metadata, and looks like this:

msgctxt "page_info.sales/credit_notes.title.default"
msgid "Sales Credit Note"
msgstr "Crédito de venta"

msgctxt "page_info.sales/credit_notes.title.new"
msgid "New Sales Credit Note"
msgstr "New Sales Credit Note"
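
To make the tree-versus-pairs mismatch concrete, here is a minimal Ruby sketch of the flattening step: it walks a nested translation hash and emits one PO entry per leaf, with the dotted key path stored in msgctxt. This is not how i18n-translators-tools is implemented; `flatten_keys` and `po_entry` are hypothetical helper names, and the sketch skips PO headers and the escaping of quotes inside strings.

```ruby
require "yaml"

# Hypothetical helper: walk a nested translation hash and return
# ["dotted.key.path", "string"] pairs, one per leaf.
def flatten_keys(tree, prefix = nil)
  tree.flat_map do |key, value|
    path = [prefix, key].compact.join(".")
    value.is_a?(Hash) ? flatten_keys(value, path) : [[path, value]]
  end
end

# Hypothetical helper: render one PO entry, keeping the tree position
# in msgctxt. Quote escaping is deliberately left out of this sketch.
def po_entry(path, source, translation = "")
  [
    %(msgctxt "#{path}"),
    %(msgid "#{source}"),
    %(msgstr "#{translation}")
  ].join("\n")
end

english = YAML.load(<<~YAML)
  page_info:
    sales/credit_notes:
      date: "Date"
      title:
        default: "Sales Credit Note"
        new: "New Sales Credit Note"
YAML

pairs = flatten_keys(english)
puts pairs.map { |path, source| po_entry(path, source) }.join("\n\n")
```

Going the other way (PO back to Yaml) means splitting each msgctxt on the dots and rebuilding the nested hash, which is why the key path has to survive the round trip through the translation tool.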

So it's possible to use Google Translator Toolkit to translate your Rails Yaml files, provided you use the i18n-translators-tools library to do the conversions, and configure your Rails applications to support fallbacks.
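
As a rough illustration of what "fallbacks" means here, the plain-Ruby sketch below looks a key up in the requested locale and, if it is missing, retries in the default locale. In a Rails app you would not write this yourself; recent Rails versions enable the built-in behaviour with `config.i18n.fallbacks = true`. The `TRANSLATIONS` hash and the `lookup` and `translate` helpers are hypothetical names for this example only.

```ruby
# Hypothetical in-memory translations; in a Rails app these would live in
# config/locales/en.yml and config/locales/es.yml.
TRANSLATIONS = {
  en: { "page_info" => { "title" => { "default" => "Sales Credit Note",
                                      "new" => "New Sales Credit Note" } } },
  es: { "page_info" => { "title" => { "default" => "Crédito de venta" } } }
}.freeze

# Hypothetical helper: resolve a dotted key in one locale, or nil if missing.
def lookup(locale, key)
  key.split(".").reduce(TRANSLATIONS[locale]) do |node, part|
    node.is_a?(Hash) ? node[part] : nil
  end
end

# Try the requested locale first, then fall back to the default locale.
def translate(key, locale:, default_locale: :en)
  lookup(locale, key) || lookup(default_locale, key)
end

puts translate("page_info.title.default", locale: :es) # found in Spanish
puts translate("page_info.title.new", locale: :es)     # falls back to English
```

With fallbacks in place, an untranslated string shows up in the source language instead of as a blank or a "translation missing" marker.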


28 Mar 2012

Second, Third, and Fourth-Order Effects of Social Marketing and Mass Securitization

Several years ago, Facebook founder Mark Zuckerberg crowed that he was able to use the database to retroactively predict with 33% accuracy with whom people would hook up a week later. This was widely viewed as very creepy (and was not spoken about again until recently) but you can guess that this was a dog whistle meant for potential advertisers. The advertisers have listened, and now Google is scrambling to catch up with Facebook on social search (and then advertising).
It’s impossible to get clear numbers on how well this stuff works. Even Facebook and Google probably have no clear numbers, but they certainly have clear enough indications. Google obviously has a clear enough indication to reform their entire company around this. So we can assume it is real. It all seems plausible enough, right?


So we can easily assume that this trend will continue, and that Google and Facebook will correlate increasing amounts of data on us, our friends, our coworkers, and the people we encounter, and will sell this data to advertisers who will essentially be placing bets on our behaviour. If there is a 27% chance that a given couple will marry within the next nine months, then there is a 14% chance that each of their closest long-distance friends will want to buy a plane ticket to the ceremony. Therefore, as an advertiser, you buy a tranche of ads for people whose out-of-town friends are soon to marry. The MapReduce job is an exercise for Google’s new Malaysian coding shop, the tranche is sold to the highest bidder via AdWords. Bada-bing, ca-ching.


As a second-order effect, this advertising activity begins to affect the behaviour of these out-of-town friends. A measurable jump in the number of people attending out-of-town weddings results, and the price on these ads consequently rises. Advertising grows markets all the time, so this is not surprising.


Now we emerge into science fiction-ville. An analyst-bot for a huge trading firm is trawling the AdWords marketplace, looking for interesting tranches for which the price has become overweight, and happens upon the out-of-town weddings advertising market, which is suddenly hugely oversubscribed. It pops up on the screen of a junior analyst (of the human variety) who clicks through to approve the creation of an out-of-town weddings futures market, which the trading firm then (automatically) proceeds to sell to its customers, and then (automatically) takes a short position.
An analyst-bot for one of the advertising agencies flags this new offering, and raises it to the desk of the (human) product manager for this market. She promptly buys into the futures market, betting that the market will rise. She talks to an executive VP and gets approval to buy a large product placement with a popular television show to feature a destination wedding as an upcoming plot. She does not get approval for a proposed contribution to a PAC formed by the National Organization for Marriage, as the VP is gay and cites the growing market for same-sex weddings.


Of course, this assumes that the securitization of everything will continue apace. Certainly there has been no progress in stemming the tide, and I don’t expect it to happen (barring a bloody worldwide insurrection against the dominant economic order).


What are some other examples of the weird things that could result from social marketing combined with this level of financial automation?
  • A new global baby boom triggered by businesses embracing new market development, caused by an algorithmic storm of projected demand for diapers, crude oil, softwood lumber, and manual labour. [The whole thing is triggered by a rounding bug in an Excel spreadsheet.]
  • Investment banks engage in wide-scale manipulation of tampon supply futures indexes by using sponsored advertisements to influence birth control method preferences so that women favour Depo-Provera over oral contraceptives.
  • The Corrections Corporation of America gets into a bidding war with Indian defense contractors on a cheap-labour-supply futures index, which is based on the relative probability of incarceration due to attempted drug sales by American teens.  The Indian defense contractors are shorting this to offset their own risk (due to the effect of rural broadband penetration shortfalls on the gold mining talent pool), and the market becomes very volatile.  To ease this situation, the CCA makes a large automated contribution to a tough-on-crime SuperPAC.
  • Asperger's patients become a new hot dating commodity, as their profiles are moved to the top of the activity ranking by social networks who wish to boost their visibility to advertisers who are bidding extremely highly for their ad dollars.  Social networks optimize their users' lives to improve their value to advertisers.  This results in nerds getting laid a whole lot more, and lots more little Asperger's-prone nerdlings (who have truly wonderful advertising potential).
So just remember kids, just because you don't click on those ads in Facebook doesn't mean that those ads aren't clicking on you. And with Google+ and Facebook embedded in every single webpage, you can run, and you can hide, but you cannot avoid being aggregated, and those aggregations will be monetized until they control your every move. Resistance is futile.



Re-reading this hours later I realized that what I'm describing here is a much less rosy portrait of the same technological trends outlined by Bruce Sterling in his seminal short story Maneki Neko back in 1998. Except of course his story has excellent characterization, plot, and narrative drive.

8 Nov 2011

AGPL revisited: how MongoDB licensing differs from MySQL

Now that the Affero General Public License (AGPL3) is actually being used by successful projects, I'm looking at it again. Specifically, MongoDB is AGPL3 licensed, and it is being used for commercial applications. But how?!? I thought the AGPL was complete communism, and that's what excited me so much about it - one touch of the brush, and the whole batch of milk is stained vermillion, and your entire enterprise now belongs to Richard Stallman so he can use it to fund GNU HURD.

The AGPL actually has some pretty fixed boundaries:
A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.
Upon reflection, the AGPL isn't as restrictive as I once thought. Let's take what I consider to be the most successful GPL (v2) product: MySQL*, and consider what would have happened if it had been released under AGPL instead. Since Amazon used MySQL code to build RDS, under the AGPL Amazon would be forced to release the code they use to provide the RDS service. They would not be forced to release the code for Amazon.com** however: that would clearly be outside the boundaries set out in AGPL.

Also consider that Facebook uses MySQL internally, with something like 4000 MySQL databases to power much of their site, and they've made many changes to MySQL in order to make that possible, some of which they've made public. If MySQL had been AGPL-licensed, they would have been required to make those changes publicly available under the same license.

Google is also reportedly one of the largest users of MySQL, and in a similar spirit they have released some of their tools. However, they released these tools under the more permissive Apache 2.0 license: if MySQL had been released under AGPL3, Google would most likely have been forced to release these tools under AGPL3 as well.*** And now that Google is also offering Google Cloud SQL made with GPL-based MySQL, they don't have to share their work as they would if MySQL were AGPL3-based.

All of this to say: if you want to use MongoDB to power a web app, have fun: the boundaries within the AGPL3 are there to help you, and probably won't require you to hand over your code to every visitor. However, if you see MongoDB and think "hey, that's cool, I'm going to offer a web service with the MongoDB API and become a cloud provider of NoSQL data storage, just like Amazon SimpleDB" then you will have made a derivative work, and you'll have to share those changes with the world under AGPL3.

Finally, IANAL, not in any jurisdiction, and if you base your legal strategy on lay analyses found on personal blogs, then sadly you're not alone and you're in very risky company. Best of luck, however, in finding a copyright attorney who will dig through these issues for you and give you an opinion for less than $500k.



* The Linux kernel is more widely used than MySQL, but it's so mixed up with other licences that it can't just be GPL anymore, not honestly - and the copyrights are owned by so many different people that nobody can claim ownership. MySQL, on the other hand, was always extremely diligent about maintaining ownership of every line of code they include in their distribution (which made acquisition by Sun and Oracle all the more attractive).
** ... that is, provided Amazon.com was built using MySQL, which it isn't AFAIK.
*** They could still licence their code any other way they want, as they own it, but they'd be required to license it under AGPL3 as well.

7 Jul 2010

Will HTML5 make app stores obsolete? Don't count on it.

HTML5 is a lovely platform for cross-device development. Basically, it's the only game going forward. But it's really not an answer for building a great app for a given platform. Apple is talking up HTML5 in order to combat Flash, but it's just talking about web sites, not apps. HTML5 will rule the [moribund] desktop, but for mobile devices I think it has major challenges.

HTML5 does not get the same level of access to the device that you need to build a rich experience.
  • Integration with the contact list (is there anything really more important?)
  • Access to phone status, history and actions
  • Camera(s), proximity sensor(s), microphone(s), accelerometer, compass, gyroscope, multi-touch, speakers, etc (and a lot more to come)
  • Local storage, access to SD-card files, application backup and restore
  • Native configuration and management interfaces (sync, preferences, phone migration, privacy, network gsm-vs-wifi, etc)
  • ... drumroll please: the app stores. This is the channel for getting apps for these devices. Otherwise they have to find your website somehow.
HTML5 apps will be good enough in some cases for all devices, but they'll always be step-children to the native environment. You could argue that we just don't care about the weird sensors and whatever else HTML5 doesn't give us access to. I disagree: the really useful apps for mobile platforms will take full advantage of these features, recording and correlating all sorts of information and drawing conclusions from it - where you are, where your customers are, when you're together, what you're carrying with you, how your spouses are getting along, what you've sold them, transcripts of your conversations, your body temperature, what they've bought recently, voice stress analysis, who else is around, what's mouldering in their warehouse, and what expression is on your face. Science fictional? Sure. "The future is already here, it's just not evenly distributed." Mobile technology is going to level the playing field for these kinds of intelligent applications.

HTML5 will continue to evolve and will slowly add access to mobile functionality common to all devices, in a lowest-common-denominator way. The fact that WebKit will be on BlackBerry by the end of the year makes HTML5 a cross-browser contender - it will lock up the entire mobile landscape, making cross-platform browser apps even easier. But so far, geolocation is one of the few things that work cross-platform. The full list of things above will come over Steve Jobs' dead body. [I'm only half-joking.] PhoneGap is the only cross-platform development environment that currently has any viability at all, and it's risky because Apple routinely rejects PhoneGap-based apps; although they're written in JavaScript, which is technically allowed, Apple takes a dim view of anything not *originally* written in Objective-C. I don't expect Apple to be changing direction and opening up their platform and their store. If RIM survives [fat chance] its app store may go in a different direction - but I'm not holding my breath: RIM is completely beholden to [evil] carriers.

The app stores are large and getting crowded. But the publishers, labels, studios and carriers are in bed with the Google and Apple markets, and they have real legs. The markets are evolving extremely fast, especially Google's (which is moving into music and movies, and even has meta-markets like AppBrain) and Apple is s-l-o-w-l-y migrating to a non-desktop iTunes store. The smartphone market is exploding, and every one of these devices has an icon on the front screen for the app store. I don't expect these stores to go away any time this decade - there's just too much money to be made.

19 Jan 2010

Google Docs lets you upload any file! Really? No, not really.

I decided to give it a try. Sounded cool.


Uh, ok. That doesn't make much sense. Is the limit 250MB or 1MB? Or what? I guess I'll look at the help.

So tell me, how does this reconcile with "Upload any file"? Not a great experience here. Google, I'm disappointed.

18 Jan 2010

Rogers tells HTC Dream users to turn off GPS or 911 calls won't go through

On January 15 I received an SMS message from Rogers telling me I'd better disable GPS on my phone or I wouldn't be able to make 911 calls. This is the latest chapter in the unhappy saga of the HTC Dream on Rogers.
Rogers/Fido service message: URGENT 911 Calls: Please disable GPS location on your HTC Dream device to ensure all 911 calls complete. HTC is urgently working on a software upgrade and we will provide details shortly so you can re-enable GPS.

Instructions: Select Menu - Select Settings - Select Location - Uncheck Enable GPS Satellite

Message de Rogers/Fido : URGENT - Appels 911 : Veuillez désactiver la localisation GPS sur votre appareil HTC Dream afin de vous assurer que tous les appels 911 soient acheminés. HTC développe le plus rapidement possible une mise à jour du logiciel et nous vous fournirons les détails sous peu afin que vous puissiez réactiver la fonction GPS.

Instructions : Sélectionner Menu - Sélectionner Paramètres - Sélectionner Location - Désactiver les satellites GPS
First Rogers announces that they're not providing any more upgrades to the software on this platform. Then they announce that they'll upgrade Dream users to the HTC Magic for free (well, with a contract extension). Then the damn thing just doesn't work. Ah, the joys of early adoption...

I just want an Android device with a keyboard. Is that too much to ask?

19 Aug 2008

Skating circuit around downtown Vancouver

This evening I did a circuit around downtown Vancouver on skates: 13km round trip from our place; it took me 75 minutes. The route is designed to avoid hills as much as possible: the only steep hill I couldn't avoid was the Main St. bridge.

I used the Google Distance Measurement Tool to create this map, but there was no way to save it.
Downtown Vancouver Skate Circuit: capture of Google Map

Street-by-street summary

Seymour St, Helmcken St. (future greenway!), Richards St, Beach Crescent, Seawall, Carrall St Greenway (beeyoutious!), E. Cordova St. (baaad neighbourhood), Main St (gorgeous view from the bridge, which has a steep decline for inline skates), Waterfront Rd. (which goes under the SeaBus bridge, Canada Place and annex), dive through a parking garage to get to the Coal Harbour Seawalk. At this point you can stay on the seawall, although I use the side streets (Cardero St, Bayshore Dr, Denman St.) because the paving stones are a little too bumpy for my taste. This takes you to Stanley Park. You can go around the Stanley Park Seawall, which is beautiful (another 9km), but I chose to take the bike trail through the Georgia St pedestrian tunnel and along the shore of Lost Lagoon, and follow the path through the bike tunnel under Stanley Park Dr. to Second Beach. From Second Beach the bike path is very narrow and shared with pedestrians, so look out... but continue along the (beautiful) seawall until the end of Sunset Beach where you reach the Vancouver Aquatic Centre. From there, I recommend taking Beach Ave back to Beach Crescent – which is a full loop.

4 Aug 2008

Google's quality problem

Google's service used to be of the highest quality. As the company has grown, it maintained that quality – despite their famous eternal "betas", their work was actually very reliable.

I'm starting to see quite a lot of exceptions to that. I saw my first Google server error a couple of months ago (after eight years of using it), and since then I've seen a lot more.

Recently I decided to try Google Knol. Whoops... can't verify my identity:

Screenshot-502 Server Error - Mozilla Firefox

Then Blogger went psycho, out of the blue while scrolling down my blog page:

Google goes south, yet again

And today I saw an article about Google Translation Center. I thought "hey, I hope this works..." and sure enough, it doesn't let me sign up:

Sign up for Google Translation Center fails

What's going on? Google is trying hard to grow its business beyond search-based advertising revenues, and at the same time it is trying hard to drive traffic which results in advertising revenues. As they do this they risk diluting their famous brand with low-quality attempts to solve new problems.

3 Oct 2007

Sayōnarā, WebEx

I have cursed WebEx for years:
  • every time I waited ten minutes for the crappy ActiveX control or equally crappy Java applet to (fail to) load

  • every time desktop sharing loaded but showed nothing

  • every time I struggled to export a PowerPoint document into its proprietary Universal (?!?) Communications Format with its PowerPoint plugin that never worked

Why all that frustration? Just to control what page people look at during a PowerPoint presentation, for the most part. Sometimes, rarely, for showing them an actual live application.

As with anything that truly pisses me off, I was once a fan. For one incandescent second in 1999 WebEx was cool. But they never improved a damned thing. And fickle me, I've found a new shiny thing: Google Docs Presentations. For creating presentations it isn't much – you'd better not want more than bullet points – but for showing slides to others? Oh, bliss... just fire up the presentation and send the link to the attendees. So create your presentation in Keynote, PowerPoint, or OpenOffice Presentation, save it in PowerPoint format, then upload it to Google Docs, and you're set. It is a beautiful thing. Bye-bye, WebEx, it was fun for a while.

15 Sept 2007

Maps for the rest of the world

Google Maps now includes map data for 54 more countries, including many in Latin America -- and of great interest to us, Mexico.

Years ago we bought a street map of Veracruz. The maps are terrible... Mexico does not have a great tradition of map use. Mexico City has the excellent Guia Roji (as do Guadalajara and Monterrey), but the rest of Mexico gets by on blurry, out-of-date maps. If you could even identify minor streets it was a major victory.

Here's our neighborhood in Veracruz, Colonia 21 de Abril. Our house is at Florencia Veyro 264, between Echeven and Sánchez Tagle, just above the "V" in "Veyro".


Apparently the map provider hasn't put street number information into these maps, and driving/walking directions are not yet supported. But even so, this is a wonderful thing. Hooray Google!

15 Apr 2007

DoubleClick on Google

Old news already, but Google's purchase of DoubleClick is a pretty big deal. Google already owned most of the online ad revenue, and now they own the rest of it. Granted, their evil quotient just went up again, but I guess everybody cashes in sometime. I can understand how Yahoo! was outbid, but Microsoft still has $29 billion in the bank. Ballmer says advertising is important, but he seems incapable of doing anything about it. I suppose he's counting on his geniuses to invent something new. So far, they sell ads on MSN – like, wow. Good luck with that, chump.

But honestly, what is the deal with Ballmer? Last time it was YouTube:
I am surprised that Google would pay $1.6 billion for it.
No. I'm not saying it is overvalued. I'm not trying to say that. It depends on a set of factors. I'm not saying I wouldn't write a check for that amount of money. I might.
Oh yeah? Well, buddy, if you can't innovate within your company, then you'd better stop dithering and buy something.

9 Jul 2006

When buildings collide

Google Maps has satellite images that are cleverly stitched together. But due to parallax, sometimes those images don't quite line up exactly as they should. It would be fascinating to see the image stitching algorithm.

3 Jul 2006

Google takes on payment processing

Analysts are billing it as a "PayPal killer" but I think that's off the mark.

Being me, I have to search for an apt analogy: if this is a PayPal killer, then mammals were a coelacanth killer. Which is to say: I think Google has a bigger target in mind than PayPal, which is a small piece of the pie (which everyone hates anyhow). Instead they're taking on the banks, First Data, and (since the acquisition of Verus) Sage.

Let's see, add together Google Base, Google "office" (gmail, spreadsheets, etc), and now Google Checkout? That's starting to look like an ERP or NetSuite-type solution pretty fast.

And now, a cautionary tale:

In 1974, IBM created SNA (the Systems Network Architecture). They built something with the ultimate depth of (mainframe) functionality in preparation for the explosion they saw coming in computer networks. I picture the Big Bluers sitting around a conference table in Poughkeepsie, chainsmoking Pall Malls and saying, "by gilly, someday there could be as many as a thousand machines networked together! We must make sure we defend IBM's mainframe market share in that environment!"

SNA has disappeared from view. Sure, there must be a couple of SNA networks out there... coelacanths. TCP/IP and other smaller, more flexible network stacks were what carried us to where we are today. I once read that OSI (another dead network protocol stack) was a "mammal designed by a saurian committee."

When the climate changes, species either mutate or become evolutionary niche players. Reproduction doesn't cut it anymore.