
21 Jun 2012

Rails test coverage: sometimes 100% is just right

DHH, the éminence grise of the Ruby on Rails world, took a swipe at the test-first cult with his provocative article "Testing like the TSA", saying in effect that 100% test coverage is mad, bad, crazy behaviour, worthless, and an overall affront to good taste and a crime against humanity. [I paraphrase.] Since we enforce 100% code coverage at all points through our development process, I want to explain how this does not necessarily make us time-wasting, genital-fondling idiots, how the needs of our business drive our quality strategy, and how this pays off for us.

Quality Stamp

At Sage our customers demand and deserve the best we can deliver. We are very quality focused because we build accounting solutions in which getting the right answer matters a great deal: perhaps some customers don't care about quality, but ours demonstrably do. Perhaps in some cases time-to-market is much more important than reliability or maintainability: it is a business decision, and there is no one-size-fits-all answer. However, if you're building for the future and want to avoid years of functional paralysis and a costly rewrite, building an application on a solid quality foundation makes a lot of economic sense.

Write less code

The most effective way to maintain 100% test coverage is by writing less code. We refactor like crazy, and we refactor our tests just as much as our code. We don't repeat ourselves. We spend time creating the right abstractions and evolving them. Having 100% test coverage makes it much easier for us to do this: it is a virtuous cycle.

We've been doing Rails development at Sage for five years now, and we've learned a few lessons. Even if you're writing unit tests with 100% code coverage, you're doing it wrong if:

  • Generators are used to build untested code (i.e. using the default Rails scaffolds to build controllers and views)
  • Partials are the most sophisticated method of generating views, and they look like PHP or ASP
  • The tests are harder to understand than the code

green-refactor-red

What is the alternative? Well, if all of the controllers and views look pretty much the same, factor them out. The Rails generators create enormous amounts of crappy, unmaintainable boilerplate code – every bit as much as a Visual Studio wizard. On the other hand, if the controllers and views are each completely different and unique flowers, is it for a good reason or is the code just a mess? Chances are, if the code looks like a mess, so does the app.
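
To make "factor them out" concrete, here is a minimal sketch (hypothetical model and controller names, not from our codebase) of one way to do it with RSpec shared examples: the common controller behaviour is specified once and mixed into each controller spec, so the specs stay small even at 100% coverage.

# spec/support/standard_resource_controller.rb
# Hypothetical shared examples: the common CRUD behaviour is written once.
RSpec.shared_examples "a standard resource controller" do |model_class|
  describe "GET index" do
    it "responds successfully" do
      get :index
      expect(response).to be_successful
    end
  end

  describe "POST create" do
    it "does not create a record from invalid attributes" do
      expect {
        post :create, params: { model_class.model_name.param_key => invalid_attributes }
      }.not_to change(model_class, :count)
    end
  end
end

# spec/controllers/credit_notes_controller_spec.rb
RSpec.describe CreditNotesController, type: :controller do
  let(:invalid_attributes) { { number: nil } }

  it_behaves_like "a standard resource controller", CreditNote
end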

In my experience it's also basically useless to attempt to retrofit unit test code coverage onto a project that doesn't have it: the tests that wind up written are always written to pass, and they rarely help much. I haven't yet seen a project that could be rescued from this situation.

Whom do you trust?

When DHH says that the use of ActiveRecord associations, validations, and scopes (basic Rails infrastructure) shouldn't be tested, he's claiming that Rails is never wrong: not now, not in the future, not ever. It's his choice to make that promise, but it would be irresponsible of us to believe it:

  • Rails changes all of the time. Sometimes there are even bugs! (Crazy talk, I know!) But ActiveRecord associations and scopes are complex and ornery, and can easily be broken indirectly (through a change elsewhere in the code).
  • Because we operate on the Internet, new security risks and fixes appear constantly: zero day attacks are real. We need to react to these threats quickly, and being able to prepare and deploy new versions of our apps based on updated components immediately is crucial. Having a robust test suite makes it much cheaper and less stressful to implement these changes, which drives down technical debt and makes development more responsive, and oh yeah, helps prevent a costly rewrite.
  • We use components that extend and complement the behaviour of Rails. DHH calls out the example of testing validations as particularly useless. Well, what about when the validation methods change in a Rails upgrade? Or you want to adopt a new plugin that changes core Rails behaviour? Or you want to refactor an application to move validation to a more useful place? In all of those cases the tests on validation code would be useful.

Often this means a function in a spec mirroring a function in a model (but with enough difference in naming and syntax to be truly maddening). Yes, this feels stupid sometimes, but it is a very cheap insurance policy, and sometimes it pays off.
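
To make the "mirroring" concrete, here is a hypothetical example (not one of our models): the validation is declared once in the model and restated, in slightly different words, in the spec.

# app/models/credit_note.rb
class CreditNote < ActiveRecord::Base
  validates :number, presence: true
  validates :total,  numericality: { greater_than_or_equal_to: 0 }
end

# spec/models/credit_note_spec.rb
RSpec.describe CreditNote do
  it "is invalid without a number" do
    expect(CreditNote.new(number: nil, total: 10)).not_to be_valid
  end

  it "rejects a negative total" do
    note = CreditNote.new(number: "CN-1", total: -5)
    expect(note).not_to be_valid
    expect(note.errors[:total]).to be_present
  end
end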

Time split

Coffee mug reading 'I ♥ Spreadsheets'

DHH says that you shouldn't be spending more than 1/3 of your time writing tests. This leads to a question: how are you characterizing your time? Is the person doing the implementation also the person making design decisions? If you are doing behaviour-driven development you are actually vetting the requirements at the time you write the tests, so is it a good idea to skip that part and move on to the coding? If you spend time refactoring tests to speed up the test process, should that be counted? Should the time spent writing tests before fixing bugs be counted? Have you decided to outsource quality to a bunch of manual testers? What is your deployment model? I'm reluctant to put a cap on the time writing tests. I find this metric as useful as dictating the time spent typing vs. reading, or the amount of time thinking vs. talking: my answer is not yours, and the end result is what matters.

Risk assessment

We enforce 100% test coverage because it ensures that no important line of code goes completely untested. One can decide to write tests for "important" code and ignore the "unimportant" code, but unfortunately a line of code only becomes "important" after it has failed and caused a major outage and data loss. Oops!
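
The enforcement itself is cheap to automate. Here is a minimal sketch using SimpleCov (one tool among several, not the only way to do it): the test run fails if any line of application code goes unexercised.

# spec/spec_helper.rb
require 'simplecov'

SimpleCov.start 'rails' do
  minimum_coverage 100   # fail the run if coverage drops below 100%
end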

Road sign: reality check ahead

DHH avers that the criticality and likelihood of a mistake should be considered before deciding to write a test about something. However, this ignores the third criterion: cost. Is it cheaper to spend time deciding the criticality and likelihood of writing vs ignoring tests for every single line of code, or is it cheaper to just write the stupid test and be done with it? Given the cost of doing a detailed long-term risk analysis on every line of code, does anybody ever really do it, or is the entire argument just an elaborate cop-out? The answer gets a lot clearer once you elect to write a lot less code, and it gets easier once you resign yourself to learning a new skill and changing your behaviour.

Closing

Code coverage is a great way to measure the amount of exposure you have to future changes, and depending on your business, it might be necessary to have 100% coverage. A highly respected figure speaking ex cathedra can be very wrong when it comes to the choices you need to make, and sometimes it shows. 100% code coverage may seem like an impossible goal, especially if you've never seen it done. I'm here to tell you it's not impossible: it's how we work, and in our case it makes a lot of sense.

13 Jun 2012

Rails i18n translations in Yaml: translation tool support

With Rails 2.2 the i18n API was introduced with a new method for translations.  Instead of embracing the venerable gettext which had been the previous standard, the Rails team invented a new way using Yaml files.  The result is a particularly graceful, flexible and very Rubylike way of specifying translations.  It also is much more reliable than gettext, which had many inscrutable issues with locales and caching, and sometimes caused people to get things in the wrong language.  So: bravo, great job.
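
For anyone who hasn't used it, the shape is roughly this (an illustrative key, not from a real application): translations live in per-locale Yaml files and are looked up by dotted key.

# config/locales/en.yml
#   en:
#     flash:
#       saved: "Your changes have been saved."
#
# config/locales/fr.yml
#   fr:
#     flash:
#       saved: "Vos modifications ont été enregistrées."

I18n.t('flash.saved')               # => "Your changes have been saved."
I18n.t('flash.saved', locale: :fr)  # => "Vos modifications ont été enregistrées."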

But to do this, they specified their own translation format, the very flexible Yaml file. There are already a lot of formats floating around, and translation tool vendors and open-source translation developers have been working for a long time on conversion tools between them. The Translate Toolkit and Pootle emerged from South Africa (a country which groans beneath the weight of, sorry, revels in the glory of eleven official languages), and together they provide an excellent web-based tool for collaboration, centered around gettext PO files. However, poor little Pootle started a migration from Python to Django, and we all know how rewrites go. [Halfway. Badly.] But Translate Toolkit supports a lot of formats:

  • moz2po - Mozilla .properties and .dtd converter. Works with Firefox and Thunderbird
  • oo2po - OpenOffice.org SDF converter (See also oo2xliff).
  • odf2xliff - Convert OpenDocument (ODF) documents to XLIFF and vice-versa.
  • prop2po - Java property file (.properties) converter
  • php2po - PHP localisable string arrays converter.
  • sub2po - Converter for various subtitle files
  • txt2po - Plain text to PO converter
  • po2wordfast - Wordfast Translation Memory converter
  • po2tmx - TMX (Translation Memory Exchange) converter
  • pot2po - initialise PO Template files for translation
  • csv2po - Comma Separated Value (CSV) converter. Useful for doing translations using a spreadsheet.
  • csv2tbx - Create TBX (TermBase eXchange) files from Comma Separated Value (CSV) files
  • html2po - HTML converter
  • ical2po - iCalendar file converter
  • ini2po - Windows INI file converter
  • json2po - JSON file converter
  • web2py2po - web2py translation to PO converter
  • rc2po - Windows Resource .rc (C++ Resource Compiler) converter
  • symb2po - Symbian-style translation to PO converter
  • tiki2po - TikiWiki language.php converter
  • ts2po - Qt Linguist .ts converter
  • xliff2po - XLIFF (XML Localisation Interchange File Format) converter

On its heels, Google introduced the Google Translate Toolkit, which lets you use the Google Translate engine to suggest translations (based on its own databases or translation memories you provide). It also does the core of what Pootle does (collaboration, access control), but without the crashing and flakiness.

But neither of them supports Yaml files. In the intervening two and a half years, translation tools have not embraced the format. I did find one solution: i18n-translators-tools, which supports conversion between Yaml and gettext PO files, but it's somewhat idiosyncratic, and it turns out there's a good reason why there isn't a straightforward Yaml-to-PO converter: the PO format consists of name-value pairs with metadata, while the Yaml format is a tree.

English source Yaml file:

page_info:
  sales/credit_notes:
    date: "Date"
    title:
      default: "Sales Credit Note"
      new: "New Sales Credit Note"

Spanish Yaml file produced by i18n-translators-tools from a PO file:

page_info:
  sales/credit_notes:
    date: "Fecha"
    title:
      default:
        default: "Sales Credit Note"
        translation: "Crédito de venta"
      new:
        default: "New Sales Credit Note"
        translation: "New Sales Credit Note"


There are some interesting things going on here: the Spanish Yaml file provides fallbacks so untranslated strings don't come through as blank.  The intermediate gettext PO file keeps the tree structure in the msgctxt metadata, and looks like this:

msgctxt "page_info.fuji_sales/sales_credit_notes.title.default"
msgid "Sales Credit Note"
msgstr "Crédito de venta"

msgctxt "page_info.fuji_sales/sales_credit_notes.title.new"
msgid "New Sales Credit Note"
msgstr "New Sales Credit Note"

So it's possible to use Google Translate Toolkit to translate your Rails Yaml files, provided you use the i18n-translators-tools library to do the conversions, and configure your Rails applications to support fallbacks.
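
For the fallback part, the standard Rails mechanism is a one-line setting (a sketch for a typical setup, with MyApp as a placeholder name; i18n-translators-tools also ships its own helpers for the default/translation pairs shown above):

# config/application.rb
module MyApp
  class Application < Rails::Application
    config.i18n.default_locale = :es
    config.i18n.fallbacks = true  # missing :es keys fall back to :en instead of coming up blank
  end
end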


17 Apr 2012

Homogeneous web development: Meteor, Derby, Firebase and the portents of doom


A variety of new web frameworks are being cooked up that allow you to write one set of seamless code for the client and server, a problem that has haunted the web development community since the dawn of JavaScript and the DOM. One approach is to basically define the database operations on the client. Does that sound like a good idea, or does that sound like a great idea?

Meteor: Exposes the MongoDB API directly on the client to work on automatically-synced data subsets. What could possibly go wrong? Let's name the project after a flaming ball of rock and find out for sure!

Derby: Is client-side MVC too confusing? Is Node.js too immature? Let's combine them and see what happens! (It remains to be seen whether Derby is named after a hipster hat or a county fair event.)

"We have a full security system in the works that will allow you to control read and write access on individual locations in Firebase on a per-user basis. However, it’s not ready for widespread use yet, so right now all data in Firebase is publicly accessible. Please keep this in mind when building apps! Please contact us if you need security or want to be one of the first to try out the new system." *

Despite my scornful tone, I'm actually very optimistic about these technologies, and very hopeful that at least one of them will ultimately be successful. I'm also really happy that I'm not going to be the first person trying to build an application on this stuff. Given the theme of the project names, it's fair to say that most early adopters will get burned.


* Yes, that's a direct quote.

8 Nov 2011

AGPL revisited: how MongoDB licensing differs from MySQL

Now that the Affero General Public License (AGPL3) is actually being used by successful projects, I'm looking at it again. Specifically, MongoDB is AGPL3 licensed, and it is being used for commercial applications. But how?!? I thought the AGPL was complete communism, and that's what excited me so much about it - one touch of the brush, and the whole batch of milk is stained vermillion, and your entire enterprise now belongs to Richard Stallman so he can use it to fund GNU HURD.

The AGPL actually has some pretty fixed boundaries:
A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.
Upon reflection, the AGPL isn't as restrictive as I once thought. Let's take what I consider to be the most successful GPL (v2) product: MySQL*, and consider what would have happened if it had been released under AGPL instead. Since Amazon used MySQL code to build RDS, under the AGPL Amazon would be forced to release the code they use to provide the RDS service. They would not be forced to release the code for Amazon.com** however: that would clearly be outside the boundaries set out in AGPL.

Also consider that Facebook uses MySQL internally, with something like 4000 MySQL databases to power much of their site, and they've made many changes to MySQL in order to make that possible, some of which they've made public. If MySQL had been AGPL-licensed, they would have been required to make those changes publicly available under the same license.

Google is also reportedly one of the largest users of MySQL, and in a similar spirit they have released some of their tools. However, they released these tools under the more permissive Apache 2.0 license: if MySQL had been released under AGPL3, Google would most likely have been forced to release these tools under AGPL3 as well.*** And now that Google is offering Google Cloud SQL built on GPL-licensed MySQL, they don't have to share their work as they would if MySQL were under the AGPL3.

All of this to say: if you want to use MongoDB to power a web app, have fun: the boundaries within the AGPL3 are there to help you, and probably won't require you to hand over your code to every visitor. However, if you see MongoDB and think "hey, that's cool, I'm going to offer a web service with the MongoDB API and become a cloud provider of NoSQL data storage, just like Amazon SimpleDB" then you will have made a derivative work, and you'll have to share those changes with the world under AGPL3.

Finally, IANAL, not in any jurisdiction, and if you base your legal strategy on lay analyses found on personal blogs, then sadly you're not alone and you're in very risky company. Best of luck, however, in finding a copyright attorney who will dig through these issues for you and give you an opinion for less than $500k.



* The Linux kernel is more widely used than MySQL, but it's so mixed up with other licences that it can't just be GPL anymore, not honestly - and the copyrights are owned by so many different people that nobody can claim ownership. MySQL, on the other hand, was always extremely diligent about maintaining ownership of every line of code they include in their distribution (which made acquisition by Sun and Oracle all the more attractive).
** ... that is, provided Amazon.com was built using MySQL, which it isn't AFAIK.
*** They could still licence their code any other way they want, as they own it, but they'd be required to license it under AGPL3.

28 Nov 2010

That the Windows 7 phone could be worse is little consolation

I played with a Windows 7 phone at the Bell store (the LG Optimus Quantum). Although I wasn't repelled, I was puzzled. Things that should be fast were deliberately slow; navigation included pointless transitions that looked pretty the first time, but that I would soon get sick and tired of waiting to complete.

The device was so warm and heavy you could use it to give a hot stone massage. It is unsurprising that sales have been lukewarm. I was really hoping to see better from Microsoft, if only so that my mutual fund that depends on its performance would perk up a bit.

26 Nov 2010

OSX as a Ruby on Rails dev environment: Package Managers

A lot of Rails developers like to use OSX as their development platform. Although everybody hosts Rails apps on Linux (or Solaris under duress) lots of people love OSX for its productivity, clean interface, and most importantly, its typography.

However, as some have noted, setting up Rails on a Mac is hardly a frictionless process. Unlike Linux distros, OSX has no built-in package manager; you get your version of OSX and you get your patches, and you'd better like what you get, because every app gets updated whenever Apple or the vendor feels like updating it. This is the same as the Windows world, and it's ugly.

So a couple of efforts have stepped in to fill this void: MacPorts and Homebrew. Neither of these is going to feel like a complete solution if you're used to a package manager like APT or YUM, but they do at least automate the installation process for various open source packages. After all, when you want wget there's no reason you should have to find the website.

I'll start with MacPorts since that came first. MacPorts was inspired by BSD Ports; it is built in Tcl and C and contains a very complete set of available packages. It is quite popular and is the venerable incumbent. And personally, I hate it. I've had my OSX install ruined twice while using MacPorts, just by installing system updates; although I obviously did something wrong, it just isn't a robust solution. If MacPorts is the solution, I don't want to hear the question.

Another alternative is Homebrew, a newer Ruby-based system developed on GitHub. It has been around for less than two years, and it's a very active project with a lot of contributors. It stresses extensibility, and lots of recipes have been written to support various packages – predictably, those most popular with Rails developers. Although I don't think it solves the brittleness problem MacPorts suffers from (it doesn't address operating system component and library version dependency issues), it is very actively developed, focused on the Rails world, and easily customizable to meet individual needs.

So, although you're probably not going to get set up with a Rails development environment with OSX as quickly as you would on Ubuntu (despite Ruby being included in Xcode), there are good solutions to keep you from pulling your hair all the way out. Which will bring you to the point where you can enjoy and appreciate the kerning on the fonts in TextMate as you write your Rails code.

11 Jul 2010

Adobe Flash: Just because Steve Jobs says it's bad doesn't mean it's good

Steve Jobs' self-serving Thoughts on Flash were controversial to say the least. Yes, he was hypocritical and self-serving (as usual), but he certainly wasn't wrong.

Adobe wants everyone to treat Flash as if it is an open standard, but they haven't made it open source. They made some parts of it open source, but not the parts that matter - and as a result, developers are constantly left wondering which platforms are going to work.

@cyanogen on Twitter: Also, Flash is not going to run on your G1/Magic. At least not the official Adobe version. Ever.
@cyanogen on Twitter: Flash doesn't work because it uses a native (non-portable) library which uses ARMv7 instructions. It can't run on older processors.
As a friend said, "Apple seems just as evil as Microsoft, just not as successful. And Jobs seems even more evil than Bill Gates. Certainly a bigger bastard." I totally question Steve Jobs' motives in wanting to crush Flash, but I don't think Adobe deserves a great deal of sympathy.

7 Jul 2010

Will HTML5 make app stores obsolete? Don't count on it.

HTML5 is a lovely platform for cross-device development. Basically, it's the only game going forward. But it's really not an answer for building a great app for a given platform. Apple is talking up HTML5 in order to combat Flash, but it's just talking about web sites, not apps. HTML5 will rule the [moribund] desktop, but for mobile devices I think it has major challenges.

HTML5 does not get the same level of access to the device that you need to build a rich experience.
  • Integration with the contact list (is there anything really more important?)
  • Access to phone status, history and actions
  • Camera(s), proximity sensor(s), microphone(s), accelerometer, compass, gyroscope, multi-touch, speakers, etc (and a lot more to come)
  • Local storage, access to SD-card files, application backup and restore
  • Native configuration and management interfaces (sync, preferences, phone migration, privacy, network gsm-vs-wifi, etc)
  • ... drumroll please: the app stores. This is the channel for getting apps for these devices. Otherwise they have to find your website somehow.
HTML5 apps will be good enough in some cases for all devices, but they'll always be step-children to the native environment. You could argue that we just don't care about the weird sensors and whatever else HTML5 doesn't give us access to. I disagree: the really useful apps for mobile platforms will take full advantage of these features, recording and correlating all sorts of information and drawing conclusions from it - where you are, where your customers are, when you're together, what you're carrying with you, how your spouses are getting along, what you've sold them, transcripts of your conversations, your body temperature, what they've bought recently, voice stress analysis, who else is around, what's mouldering in their warehouse, and what expression is on your face. Science fictional? Sure. "The future is already here, it's just not evenly distributed." Mobile technology is going to level the playing field for these kinds of intelligent applications.

HTML5 will continue to evolve and will slowly add access to mobile functionality common to all devices, in a lowest-common-denominator way. The fact that Webkit will be on Blackberry by the end of the year makes HTML5 a cross-browser contender - it will lock up the entire mobile landscape, making cross-platform browser apps even easier. But so far, geolocation is one of the few things that work cross platform. The full list of things above will come over Steve Jobs' dead body. [I'm only half-joking.] PhoneGap is the only cross-platform development environment that currently has any viability at all, and it's risky because Apple routinely rejects PhoneGap-based apps; although they're written in JavaScript which is technically allowed, Apple takes a dim view of anything not *originally* written in Objective C. I don't expect Apple to be changing direction and opening up their platform and their store. If RIM survives [fat chance] its app store may go in a different direction - but I'm not holding my breath: RIM is completely beholden to [evil] carriers.

The app stores are large and getting crowded. But the publishers, labels, studios and carriers are in bed with the Google and Apple markets, and they have real legs. The markets are evolving extremely fast, especially Google's (which is moving into music and movies, and even has meta-markets like AppBrain) and Apple is s-l-o-w-l-y migrating to a non-desktop iTunes store. The smartphone market is exploding, and every one of these devices has an icon on the front screen for the app store. I don't expect these stores to go away any time this decade - there's just too much money to be made.

18 Jan 2010

Rogers tells HTC Dream users to turn off GPS or 911 calls won't go through

On January 15 I received an SMS message from Rogers telling me I'd better disable GPS on my phone or I wouldn't be able to make 911 calls. This is the latest chapter in the unhappy saga of the HTC Dream on Rogers.
Rogers/Fido service message: URGENT 911 Calls: Please disable GPS location on your HTC Dream device to ensure all 911 calls complete. HTC is urgently working on a software upgrade and we will provide details shortly so you can re-enable GPS.

Instructions: Select Menu - Select Settings - Select Location - Uncheck Enable GPS Satellite

[The same message and instructions followed in French.]
First Rogers announces that they're not providing any more upgrades to the software on this platform. Then they announce that they'll upgrade Dream users to the HTC Magic for free (well, with a contract extension). Then the damn thing just doesn't work. Ah, the joys of early adoption...

I just want an Android device with a keyboard. Is that too much to ask?

4 Oct 2009

Distributed is the new Object Oriented

In the 80s, Object Oriented development promised a fundamental reshaping of the software development landscape, and it had distinct religious overtones. (You can tell it was religious because Object Oriented is capitalized.) It was going to be better in every way than procedural programming - everything would be reused, bugs would be eliminated, and mass love would result. Like Theravada Buddhism, once you accepted the Four Noble Truths of Encapsulation, Inheritance, Polymorphism, and Modularity, everything else followed. This fever gripped the development world for twenty years, and thousands of developers never made the mental shift necessary to embrace it.

Leaders often made the fateful decision to rewrite existing procedural apps in object oriented technologies. Did the resulting programs run better? Um, no. Did they conquer the marketplace? God no. Did they run faster? Hell no. Windows Vista is a prime example; I'm not going to rehash any personal case histories because the pain is still too great. I'll let you know when I'm strong enough to cry.

Distributed development is as different from Object Oriented as Object Oriented is from procedural development. Most of the existing cadre of developers will never get this stuff, just as most procedural developers never figured out OO. Hadoop / MapReduce and Erlang require a rethinking of how problems should be solved, and a rethinking of what problems can be solved. Instead of figuring out how to best rewrite yesterday's apps with today's technologies, it's much better to treat them as solved problems and move on.

13 Jun 2009

Vancouver's Open Data, Open Standards, Open Source and the Vancouver Public Library

Vancouver has adopted a policy of Open Data, Open Standards, Open Source and I'm really excited about it. David Ascher presented on the topic at Open Web Vancouver 2009 and pointed out that if we don't engage the city and use this data it will go nowhere.

The Vancouver Public Library is one of my favourite places. I love libraries, I love books, but the library here in Vancouver is a really special library for me. So I've been thinking of ways that the library could share data so that I could build applications to make the library more interesting and more valuable to the people of the city.

Here's some data I'd like to have:
  • Books on order

    I'd like to know what new books are currently on order, but not available. I want a preview of coming attractions.

  • Most unpopular books

    What doesn't get checked out? What's likely to get sold in the next round of disposal, ahem, book sale?

  • Most popular books

    What's everybody reading?

  • Top 100 sites for library patrons

    What are the most popular sites browsed from the library? I'd like to be able to contrast this with the most popular sites according to Alexa. That should help tell the library what sorts of services patrons need.

These are things that I could mash up into interesting applications, such as presenting a unified view of new popular books on Amazon and which ones are in the library, or popular in the local community.

8 Dec 2008

Spam now leverages social networks

Spambot

I've been getting spam lately purporting to be from a former co-worker. Apparently they harvested her MSN Messenger list – it impersonates her Hotmail account and sends to my work account.

This was probably due to a virus that hijacked MSN Messenger, a notoriously problematic service: between the service outages, trojans and viruses, its usefulness is debatable. But even as Microsoft gets its security act together a decade too late, the attack is inevitably shifting someplace else.

With social networking sites asking for email passwords to "import connections", people hand them over readily. After all, the sites say it's safe, and you can always change your password later (but you don't). As has been pointed out, as an industry we've trained people to type passwords, and that's what they do – whether it's a good idea or not, and that's why phishing is so successful. But once they have your contact list they can keep it forever, and it's a wonderful tool for a spammer.

Facebook and Twitter are unlikely to misuse this data too egregiously: they are connected to real money and are companies with reputations to protect. But Pownce, which is going out of business – what about their data? And tacky little utilities like Twitterank which spam your stream – you'd better believe they're warehousing your connections. And your private messages. And everything else. You can put these things together and draw meaningful conclusions about the people involved.

Science fiction has been talking about spambots impersonating your family and friends for years, but now it's happening for real, and expect to see a whole hell of a lot more of it. Expect to start seeing requests from friends and family, asking for money through new and unfamiliar websites (or even familiar websites that have been compromised). Expect increasingly strange and subtle requests: you may not even know what they're really trying to get you to do, or why. In short, this is going to get deeply weird, really fast.

16 Nov 2008

Favourite packages for Ubuntu Intrepid

I recently upgraded to Ubuntu Intrepid Ibex, the 8.10 release. I use "upgraded" loosely, because the distribution upgrade option has never worked for me – I did a clean install.

Add the Medibuntu repository, then:

sudo apt-get install aacgain acidrip acroread acroread-plugins audacious azureus cabextract easytag ffmpeg flashplugin-nonfree gstreamer0.10-ffmpeg gstreamer0.10-plugins-bad gstreamer0.10-plugins-bad-multiverse gstreamer0.10-plugins-ugly gstreamer0.10-plugins-ugly-multiverse gtkpod-aac hardinfo inkscape libdvdcss2 libdvdread3 libxine1-ffmpeg meld mozilla-acroread mozilla-mplayer mozilla-plugin-vlc mp3gain mplayer msttcorefonts network-manager-pptp openclipart-openoffice.org nfs-common nfs-kernel-server portmap rapidsvn skype smartmontools smbfs totem-xine ubuntu-restricted-extras unrar vlc vlc-plugin-esd w64codecs wine

28 Oct 2008

Semantic web startup Twine hard to get wrapped up in

Twine is [yet another] site that offers recommendations for webpages, stories and information based on things that you've read. I've seen demos that are amazing, that pull together disparate threads of data in new and surprising ways. It is powered by some sort of fantastic semantic juju that allows it to create recommendations and connections that simpler probabilistic analyses cannot. Sounds good right?

The problem is that it is just too. damned. much. work. You start with nothing, and have to enter your links, from scratch, one at a time. You don't get any immediate satisfaction. Unlike FriendFeed or SocialMedian, it doesn't just figure stuff out based on your other activity elsewhere on the web. It doesn't even attempt to figure out what you already like. So all of the heavy lifting is left up to the user, and there's no immediate payoff. The new user is left wondering just what the hell this site is supposed to do for them.

So although it probably has good technology, so far it's a failure. Unless they realize that people aren't suddenly going to start posting everything into their little walled garden on the promise of a payoff, maybe, someday, they'll be left behind by other sites that give new users a great experience right out of the gate. Other sites – Facebook, FriendFeed, etc. – can add this semantic hooey to their own sites at their leisure. Sometimes technology really doesn't matter.

23 Oct 2008

Central authentication is coming, and here's a good reason why

Some interesting reading today on OpenID, Facebook Connect, and the dog's breakfast of authentication standards in the market:
Facebook Connect and OpenID Relationship Status: “It’s Complicated” – John McCrea of Plaxo
The authentication landscape appears to be coalescing. I think a lot of vendors will still want to have a "walled garden" ID scheme, but I'm inclined to think their customers will drag them kicking and screaming into a federated identity world.

I have a good reason to think so. People already use a dangerous form of single sign-on: they use the same user ID and password across multiple sites. Some day soon an enterprising young script kiddie from Yemen is going to sit down and write a Distributed Identity Theft Attack that will plunder the databases of weak sites (like some forum that you don't even remember signing up for) to take possession of more valuable sites (like Facebook and LinkedIn) and then finally the holy grail (your email account, used to unlock everything else). Nobody, not even Bruce Schneier (by his own admission), has a different password for every site: at best, we have low-, medium-, and high-security passwords. But if you're using the same password everywhere, you're only as secure as the weakest site you visit, which means gold bars for the putative Yemeni banks.

Also, über-paranoid password complexity and periodic forced password change rules actually encourage people to use a password formula across different sites, and to change only the last character in a preset sequence. They're virtually assured to do so, because security training has taught people to never, under any circumstances, write down their passwords. So a dictionary attack will still work in most cases for the DITA outlined above – forty-seven variants isn't a lot to try, and most sites don't lock accounts for password failure.

So go change your online banking password right now, I'll wait. Don't forget PayPal, too. And Amazon, which holds your credit card info, as does iTunes.

So, we'll stumble along with our user ID (which is, often as not, the email address) and password (same everywhere) until the Russian Business Network strings together some Perl code and causes a smart-spam and bank fraud wave big enough to shake consumer confidence in the web. At the very least, consumers will learn not to trust websites with homegrown authentication. They'll pick one or two big-name vendors they trust.

27 Aug 2008

Why I don't recommend the iPhone 3G

iPhone backup battery
I've had the iPhone 3G for a couple of weeks, and although I think it's revolutionary, fantastic, useful, blah-blah-blah, poor battery life is its fatal flaw. For most people, a phone is a phone first and foremost, and other uses are secondary. I know I'm tethered to the internet in general, and email in particular, but a phone has to function as a phone, or it fails. To function as a phone, it has to hold a charge for at least 24 hours under light usage, and the iPhone does not.

iPhone USB power adapter

This is fixable through software. By being extremely careful about how I turn on 3G and WiFi functionality I can make it work reliably through the day without charging more than once. But I shouldn't have to exercise extreme caution and constantly massage settings to make sure the battery doesn't discharge in bare hours: it is a computer, and it should take care of that for me. This is not the traveling salesman problem; it's easy: if the screen is off I'm not using it and I don't need the 3G network, so stop trying to nuke my balls.

iPhone car power adapter

I don't know when Apple is going to clean up this mess, but I hope it will be within a few months. Without this fix the iPhone cannot be successful, and I totally want the iPhone to succeed. I've helpfully supplied Apple with a bug report on this, just in case they haven't read a newspaper or a blog, or spoken to a single sentient being who's used the device. Those of us who have it are doing everything short of implanting a car battery to keep these things running: extra charging cables everywhere, car chargers, and even expensive portable battery packs. Without a fix, this is a failed phone.

13 Aug 2008

WebEx is watching you, and won't stop


WebEx on the MacBook turns on the camera for no good reason, and doesn't let you turn it off.

I had a conference call yesterday, and as usual with these corporate time-wasters, there was a PowerPoint deck intended to distract the audience from the carbon-14 decaying in their bones. I fired it up on the MacBook I use for WebEx, because WebEx doesn't work on Ubuntu and I've already wasted more than enough time trying to fix that. While it droned on (and on), repeating previous presentations, I tried to get some other work done.

When I fired up Photo Booth to take a picture of an error I was getting on my iPhone, I was told "The camera is already in use." That's weird, I thought. Sure enough, the little green light next to the camera was on. So I started closing down apps. Finally nothing was left but WebEx, and when I shut that down the light turned off. Hmmm. So I started WebEx back up and went looking for the option to turn off the camera. And I kept looking. I couldn't find it, and that made me feel kind of dumb, so I sent in a support request to WebEx. Their response:
Hello Chuck,

Thank you for choosing WebEx.

Since you are using a built in camera, it starts automatically in the meeting. WebEx does not have any control over this and there is no option in the Meeting Manager to disable this feature.

However, if you are the host, you can uncheck the "Video" option while scheduling the session. You can uncheck this option even in the middle of the meeting.

To disable the webcam, please contact Mac Support or check in Mac Forums. For your convenience, I have provided a link which discuss about turning off webcam.

Disclaimer: The URL below will take you to a non-WebEx Web Site. WebEx does not control or is responsible for the information given outside of WebEx Web Sites.

http://osxdaily.com/2007/03/26/how-to-disable-the-built-in-isight-camera/

Please let me know if there is anything I can do to further assist you.

Regards
WebEx Technical Support.
Waitasecond. "WebEx does not have any control over this"? What the hell is that supposed to mean? Do they not have the flipping source code? WTFH? And then they recommend that I go into a console and hobble my operating system's camera support? Are they high?

Of course, that's just bullshit. They allow the host of the meeting to control the cameras of the attendees, but they don't allow you to control the camera on your own flipping machine. This is a backassward privacy policy. I have no idea where my video is going, and no control over it – it could be recorded, it could be broadcast: millions could be watching me absently pick my nose.

There is now a piece of tape covering the webcam on my MacBook. When I first used the iPhone I thought that the camera warnings when using an app that touches the camera were silly, but now I greatly appreciate them.

Bad WebEx. I'm still waiting for you to go out of business, you silly $3.2B behemoth.

4 Aug 2008

Google's quality problem

Google's service used to be of the highest quality. As the company has grown, it maintained that quality – despite their famous eternal "betas", their work was actually very reliable.

I'm starting to see quite a lot of exceptions to that. I saw my first Google server error a couple of months ago (after eight years of using it), and since then I've seen a lot more.

Recently I decided to try Google Knol. Whoops... can't verify my identity:

Screenshot-502 Server Error - Mozilla Firefox

Then Blogger went psycho, out of the blue while scrolling down my blog page:

Google goes south, yet again

And today I saw an article about Google Translation Center. I thought "hey, I hope this works..." and sure enough, it doesn't let me sign up:

Sign up for Google Translation Center fails

What's going on? Google is trying hard to grow its business beyond search-based advertising revenues, and at the same time it is trying hard to drive traffic which results in advertising revenues. As they do this they risk diluting their famous brand with low-quality attempts to solve new problems.

3 Jun 2008

Halting State

Halting State by Charles Stross (North American book cover)

I bitched all the way through the first half of Charlie Stross' Halting State about how bored I was, and how I really didn't get it. Although the setup was slow, once the men from ONCLE came in it took off and went someplace I really didn't expect.

I'm not much of a gamer, and there are few games I've gotten sucked into (Ultima XII and The Sims, that's about it), but the vision of a future with pervasive mobile gaming woven into real life rings very true, and sounds very compelling (I mean fun). Stross delivers with new ideas in a fun setting (Scotland after independence from the UK) with logical progressions of the current geopolitical environment. My only complaint is that his usual characters pop up with new skins and do their usual mating dance, but that's pretty minor and wouldn't catch your attention unless you'd recently read Singularity Sky. My final verdict: I highly recommend this deeply technical and groundbreaking book.

2 Jun 2008

Tools by tools no longer cool

For a while there I thought that Microsoft was going to take everybody down with Visual Studio Team System. They'd take their superior IDE and debugging environment, add testing and fix their crappy version control system, and they'd own the world. "Nobody else will be able to deliver everything in one package," I thought. "They'll undercut everybody else until they own the landscape, and then they'll milk us like the clueless cows we are."

I even chose Perforce for a version control system. I looked at CVS and decided it was crap; Subversion was still not there, and everything else was just not good enough. "Microsoft uses Perforce," I thought, "and how wrong could they be?" (At that point I was still in fear and awe of Microsoft. Hell, I even thought Longhorn was going to rule the world.)

How different the world is suddenly. Yes, Microsoft has a beautiful IDE that permits you to smoothly debug Windows software. But who can afford to run web software on Windows? It is simply murder on a business model. And desktop software on Vista? Yeah, right. As a result, Team System is terribly quaint all of a sudden. Trac, Subversion (or Git if you're really cool), and Basecamp are really all you need for web development, so why would you bother administering a SQL Server database and a domain controller and an Exchange server and a project server and a Team System server, buying CALs for all of the above, plus the hardware to run it, all for tens of thousands of dollars? And if you want to do truly distributed development between a core team, external contractors, or even (gasp) a wide community, Team System won't even do it. And there's the rub: that's the way software is built today.

Yesterday I saw an ad for Perforce: they're giving away a 2-user version, "No questions asked." Whoop-tee-doo, who cares. They can't even give that away. Microsoft versus Borland versus IBM was like a tyrannosaurus fighting a triceratops and a pterodactyl. It just doesn't matter.