
Archive for December, 2007

Summary (PDF of Draft Analysis)

What started for me as a typical “read Slashdot for a minute” has turned into a full-blown research project into collaboration. Participating in solving the N-BRAIN Master Software Developer challenge delivered huge amounts of experiential as well as quantitative information regarding social collaboration on software projects.

This is a particularly good research situation because the stakes were reasonably high (potential job interview, Slashdot ego boost, public display of skill), the timeframe condensed, and the entire thing is trackable/auditable.

This blog post is the result of my findings so far (less than 12 hours after the solution to the challenge).

The Set Up

  • Unknown company posts a want ad on craigslist that includes an invitation to solve the challenge for a chance at an interview. Read here for launching point.
  • Full job posting here
  • Slashdot.org community picks it up quickly, and several developers/technical people set to work, initially using Slashdot comments to post back and forth
  • The easy clues lead first to a Google Group, bringing together the challengers
  • The community forms of its own accord with no prodding or seeding (that we are aware of)
  • Google Groups becomes the repository of thoughts, questions, ideas, code samples, files, conversation, drawing board (please see the group for the final code samples and all that. Very impressive stuff)
  • Google Groups tracks all contributions by login (handle), topic (community assigned), and datetime stamp
  • The Challenge urls
  • Background info for the layperson
  • Tools Used in Challenge
    • Programming Languages
      • Perl – character counts/frequencies encoder/decoder
      • Python – character counts/frequencies
      • Java – for encoder/decoder
      • Piet (npiet)
    • Software
      • Photoshop (to count pixels)
      • Npiet (for test analysis)
    • Sites
      • WhoIs.net
      • NetworkSolutions
      • Craigslist
      • SlashDot
      • Google Groups
      • Wikipedia
      • TinyURL
    • Historical Figures and Places and Times
      • Henry Ford
      • Samuel Smiles
      • Charles Buxton Going
      • Boulder
      • Servus
      • Flavian II
      • Turing
      • Van Gogh
    • Processes/Techniques (list from PeterOfOz, contributor)
      • Game playing (recognizing a Tetris like pattern)
      • Javascript, Perl, Python, and Java programming (probably others as well)
      • Knowing how to inspect HTML pages, and includes for JavaScript and CSS
      • Web research (finding the original Ford passage, Pi lookups, Latin translations, etc.)
      • Lateral thinking and pattern analysis/recognition
      • Cryptographic analysis
      • Graphic formats
      • Numerical sequences (pi)
      • Byte code engines
      • Encoding/decoding engines
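
Several of the Perl/Python items above amounted to little more than counting character frequencies in the encoded text, a standard first move against a substitution cipher. A minimal sketch of that kind of script in Python (the input string here is made up, not actual challenge data):

```python
from collections import Counter

def char_frequencies(text):
    """Count occurrences of each character, most common first."""
    return Counter(text).most_common()

# Hypothetical encoded snippet -- the real challenge text differed.
sample = "vpvpxvpx"
print(char_frequencies(sample))  # [('v', 3), ('p', 3), ('x', 2)]
```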

Questions

This analysis focuses on several questions:

  • Quantitative
    • How quickly was the problem solved?
    • Relative percentages of general contributions to key contributions
    • Distribution of contributions over time and by person
    • Classification of contributions
  • Qualitative
    • Can a group solve things faster than a really talented individual? (Yes! We squeezed 400 man-hours into 18 real hours.)
    • Is there any correlation between quantity and quality? (Hard to tell. This was a complicated challenge and the solution didn’t need to be anything more than a one-off.)
    • Are there biases by contributor (80/20 rule: is 80% of the work done by 20% of the people)? (Yes! But at different levels. Breakthroughs were supplied by a handful of people, grunt research by another group.)
    • What makes a successful collaboration? (Solving the problem, of course! But also doing it with fewer errors, better documentation, on time, on budget.)
    • What didn’t work? (Redundant work on the encoder/decoder, multiple threads going at once, timezone differences without known “schedules” kept folks out of sync at the end… would improving these speed up the solution? Improve its quality?)
    • What were some of the group dynamics? (More to come on this in later posts… roles people filled…)
    • What schedules of reinforcement were at play? (More to come on this… the feedback loop of the group and how code/solutions become reinforcers.)
  • What I wish I had access to (Companies, if you are reading this, please provide it: it will be worth it for me to analyze, in terms of goodwill and publicity. UPDATE 12/24 morning: N-Brain reached out to collaborate!)
    • Traffic Logs from Google
    • N-Brain (company behind it) assumptions
    • Traffic logs on N-Brain
    • Interviewees Invited
  • Follow Up Analysis (will follow up in January or sooner)
    • Traffic generated to the end site, n-brain.net (can tell in quantcast.com, compete.com, and alexa)
    • Traffic generated to http://wanted-master-software-developers.com/
    • Profiles of the contributors (get resumes/cvs/bios and/or some basic demographics)
    • Success of N Brain Product Release

The Analysis

Key observations

There were almost NO flame wars or negative comments at all

Very little correlation between posting frequency/amount and breakthrough chance (the biggest breakthroughs were produced/cited by some of the least frequent posters)

Key Facts

Dataset

Over 600 postings, 300+ real contributions, 25 breakthroughs (less than 10% of contributions were breakthroughs)

Took 18 hours and 132 people (73 contributors, 59 observers) to solve challenge.

No Slashdot comments were included in this analysis. It should be noted that several key findings appeared there first, the main one being the Google Group that launched the real challenge. Many of the key postings on Slashdot were made by people who migrated to the Google Group, so this should not affect the analysis too much.

Workload

I estimate that approximately 19 people put in 10+ hours each. Approximately 400 man-hours went in, with more than half supplied by those 19 people (analysis adjusted for sleeping time and by timing of contribution; e.g., if a contributor had to sleep, discount 7 hours).

 

5.65 contributions per person on average. The max contribution count was 25; the minimum was 1. Most people contributed fewer than 5 times. It should be noted that 3 of the key breakthroughs came from contributors with fewer than 5 contributions.
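
For the curious, the per-person numbers above fall out of a simple tally of the posting log by handle. A sketch of that tally in Python (handles and counts are made up, not the real dataset):

```python
from collections import Counter

# Made-up posting log: one handle per posting, as Google Groups records them.
postings = ["alice", "bob", "alice", "carol", "alice", "bob", "dave"]

# Group by handle to get a contribution count per person.
per_person = Counter(postings)
mean = sum(per_person.values()) / len(per_person)

print(per_person.most_common(1))  # [('alice', 3)]
print(round(mean, 2))             # 1.75
```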

Peak activity and peak breakthroughs were not correlated

Classification of Workload

(Classifications are subjective to the analyst and could probably use a second eye.)

Most of the contributions were research or clues. A lot of research chased down dead ends or irrelevant facts. Very little banter or small talk. No flames on Group. A few on Slashdot.

Breakdown of contribution classifications by Contributor.

Note: the data has been scrubbed of contributions/postings that were mere banter or blank. (I fully admit to likely misclassifying and even misassigning breakthroughs and solutions to contributors. Please correct me if I did.)

Note: I considered breakthroughs to be contributions that were sub-solutions, code implementations that led somewhere, or key insights into clues.

Please see the PDF for the table on the contributor breakdown.
(PDF of Draft Analysis)

Conclusion

N-Brain got more than their money’s worth for creating this test. Beyond uncovering great talent, they learned a lot about collaborative development, especially on a wide-open problem set.

Open-style collaboration is incredibly efficient. We squeezed 400 man-hours into an 18-hour period on a holiday weekend.

There’s room for all types. Almost all contribution behavior that HELPED was quickly reinforced (follow-up analysis of the feedback loop to come). Anything that was a red herring or slightly counterproductive was extinguished almost immediately. We had one instance of information withholding early on that was quickly eliminated and never resurfaced.

Tracking of projects happens quite naturally now with all our web-based toolsets. No disruption of creativity or coding occurred, and we have a fully analyzable project.

We need to analyze more of these situations to give businesses, organizations and individuals a strategy for existing in this flat global world. More on this later…

What do you conclude?

~Russ

Read Full Post »

Man, whew! had a great last 18 hours DORKING OUT.  i’ll admit it.  i just participated in one of the biggest dorkouts ever.  It’s relevant to business, behavior and media because it represents EXACTLY what is so crazy and different about doing business in a connected world.

Sometime around 10am PST this story hits slashdot.org:
http://developers.slashdot.org/developers/07/12/22/1746220.shtml

In this post developers are keyed off to a mysterious job posting for Master Software Developers.  The job posting contains a list of attributes and then a challenge to find out who the company is and what the significance of the date 1/18/2008 is.

The initial solving of the challenge begins in the comment threads on Slashdot but quickly migrates to Google Groups as the first piece of the challenge is solved – a URL is uncovered via a base64-encoded string at the bottom of the fake job posting, AND a redirect URL to a Google Group is “encoded” in the main style sheet of the found website.
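
For those following along, recovering a URL from a base64 string is a one-liner in most languages. A Python sketch with a placeholder URL (the actual string from the fake posting differed):

```python
import base64

# Placeholder -- standing in for the string found at the bottom of the posting.
encoded = base64.b64encode(b"http://example.com/next-clue").decode("ascii")

# Decoding it recovers the hidden URL.
decoded = base64.b64decode(encoded).decode("ascii")
print(decoded)  # http://example.com/next-clue
```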

Those of us first arriving at the google group work quickly to port loose threads on slashdot and get an organized thread/conversation going on the google groups.  We quickly uncover a huge amount of clues that are related to current tasks in the challenge and future tasks.  A few javascript gurus educate and code the group through the first task which is a test driven development of a javascript function.  Some of the rest of us reverse engineer the site uncovering an image which clearly has an encoded message or a useful pattern.  We also uncover an interesting css file that, again, looks as though it has an encoded message.

In fact, it’s quickly realized by the group that this challenge is going to be a long series of encoded messages, each one more complicated than the last.

At this point, the group starts showing strengths in different areas.  We find some folks that are well versed in ciphers (encoding messages), some that are quick coders, others with great eyes for clues and patterns and so on.

The first message we uncover is the word “collaborate”.  This was found by decoding a message embedded in the original test page, which was only revealed by cleaning up and “indexing” a snippet of text about Henry Ford obtained by completing the javascript function successfully.  One person posted a great javascript function, several folks indexed the quote, and several other folks found the hidden message.  At this point we were pretty good as a group, but definitely not all working 100% together.  Some folks had gotten ahead.

But then bam.  it got hard. real hard.  No one splintered off to go their own way.  The group converged on one thread in the Google Group, and someone started maintaining summary pages of “What We Know”.  The real work began.

A couple of people set out to decode the hidden message in the CSS file.  I, personally, set to work on the code in the image file.  On suggestions from others I chased down some image analysis that went nowhere.  Someone solved the CSS file, which quickly got us to the final task without our having fully completed the second task.  It was extremely useful, though, because we got a bigger view of the problem set.  This continued on for some time…

It got absolutely amazing when everyone collaborated on decoding the image file.  An amazing amount of work went into finding patterns.  People posted a variety of analyses.  Finally someone noticed, for the second time(!), pi.  Pi was somehow involved in the image, and pi had been hinted at earlier.  It was a great tip that led quickly to uncovering a difficult-ish cipher for our last 2 puzzles.

A few code gurus pounded out a decoder based on that cipher. (that was impressive to me!).
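
The actual cipher and decoder live in the Google Group; purely to illustrate the general idea of a pi-keyed scheme, here is a toy shift cipher of my own invention (not the challenge’s actual cipher):

```python
# Toy example only: shift each letter by successive digits of pi.
PI_DIGITS = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7]

def pi_shift(text, digits, decode=False):
    """Shift letters by the digits of pi; the sign flips for decoding."""
    sign = -1 if decode else 1
    out = []
    for i, ch in enumerate(text):
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + sign * digits[i % len(digits)]) % 26 + base))
        else:
            out.append(ch)  # non-letters pass through unchanged
    return "".join(out)

secret = pi_shift("collaborate", PI_DIGITS)
print(pi_shift(secret, PI_DIGITS, decode=True))  # collaborate
```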

The clues came forth.  Most of the rest of the task was clue hunting, not coding.  It took about 6 man-hours to finally put it all together and uncover the final answer.

sometime between 4-6am PST the answer went into the challenge websites.  SOLVED.

Early in the task, speculation bubbled up about a possible association with a movie coming out on 1/18/08.  We dismissed that speculation early (though it came back up a lot), which proved to be the right call.

The challenge was put out by a Boulder, CO company, N-BRAIN.  N-BRAIN produces Collaborative Development tools for programmers… go figure!  The answer happened to be the release date of their software.

This was such an unbelievable collaboration.  I was personally engrossed enough to take my laptop and cell phone modem to my child’s gymnastics practice and to make sure I was connected at a holiday dinner via my smart phone.  I put in 12-14 straight hours myself. And for what?  THE CHALLENGE and the exhilaration of working with other people equally excited.

No doubt N-BRAIN will get some good tech press for their new product.

I suggest picking through the Google Group: http://groups.google.com/group/wanted-master-software-engineers

You’ll get a full view of the story and threads and approach.

There are many interesting learnings here.  The big one for me is… collaboration on challenging problems where the approach can grow organically can be extremely powerful.  i.e. this group had a goal.  the method was not prescribed.  use any language, use any tactic… just go.  The second big thing… how much more quickly 50-100 people working together solved a difficult problem than one person would have on their own.  This problem wasn’t limited to one domain – it involved ciphers, image analysis, pattern recognition, HTML/CSS, basic research, javascript and more.  In other words, you’d have to be EXTREMELY talented in a huge number of things to really solve this independently this quickly.  Sure, all the knowledge is out there, but as an individual it’s hard to find and absorb it all quickly.

I also learned a ton about ciphers, using Eclipse quickly (that Java encoder), the Piet interpreter, Samuel Smiles, Henry Ford, the history of Boulder…  really a huge scope of learning for the Saturday before Christmas!

I owe this story a follow up.  Really, there’s some incredible behavioral analysis possible here and I want to ferret it out.

For now, I must return to the other world of Christmas, family and all that.  this time without a smart phone under the table!

Read Full Post »

I’m a bit behind some of the other early movers…

3tera.  Taking grid and virtualization in a different direction.  They provide services for entire virtual clusters, virtual data centers, and more.

If implementing massive super computers and data centers becomes little more than filling in a sales web form, watch out hardware, hosting, and desktop sellers.

Perhaps google will get some competition now that massive CPU resources are being made available to anyone with an idea.

~Russ

Read Full Post »

  • HIV drug… down to one pill.  quietly on the market…  this will change lives
  • .net 3.5 release… finally a really robust .NET release
  • Moonlight/Silverlight – microsoft’s relatively quiet push into multiplatform
  • iphone’s pressure on other handsets… the phones that came out for all platforms kicked ass and made mobile a real platform in the US
  • xbox live’s continued improvement… this converged device is not going to go away and the consumption/tracking is unbelievably useful to businesses
  • online video’s destruction of the usefulness of comScore and Nielsen as industry benchmarks
  • circulation auditing for news/mags joining with online audience measurement – introducing real performance metrics into a speculation business (offline advertising!) is disruptive
  • Planet Earth HD series – very special visuals of our planet.  may change the nature doc approach forever and certainly was a technical/logistical achievement
  • vonage knockoffs – voip is here…
  • bad airlines – generating angst to improve travel in novel ways instead of through more shitty snacks and revised loading procedures
  • unlimited calling plans – makes mobile services possible for the masses
  • gps everywhere – mapmakers watch out…
  • mars rovers – they just keep on going and really make a case for clever space exploration
  • solid state memory and memory prices – no need to ever worry about storage. no really
  • elections starting so early – will politics ever be about policy again or just getting elected?
  • green marketing – what a horrible spin job it’s all turning into.  prices go up for green and organic when in most cases it’s CHEAPER (hm, i’ll find some data on that…)
  • rubyonrails (again) – it made other language communities make it just as easy.  not last year, this year
  • … more when I have more time

~Russ

Read Full Post »

Google Talk now does on the fly language translation.

This is huge.

It makes my bluetooth Googler Intelligence adapter that much cooler.  Yup.  I’ve been working on a bluetooth earpiece/clothing clip-on that will listen to ambient talk around you and do lookups on anything it hears, or you can set it to respond to a cue (clap, tap, voice command) from you.

The idea is that you can find out about anything via Google, anywhere you get cell service, without looking down or interrupting normal behavior.

Translation matters because you can do this in ANY COUNTRY.

Basically my service uses google talk to look things up via IM bot that queries google and/or other resources like wikipedia.  It can read back (text to speech) the abstracts, definitions, calculations, etc. into your ear, or print them on a HUD… now it will be able to do it in any language.

CONSIDER THAT.

~Russ

Read Full Post »

Grid is here and it’s a game changer.  Not today, maybe not totally in 08, but certainly in the nearish future.

What is grid computing (cloud computing), you ask?  well, it’s lots of things.  Generally it refers to the idea that you can rent N number of cpu cycles to compute whatever you need.  Run websites, crunch datasets, run simulations, parse logs.  whatever you need to do, just rent the cycles to do it from grid computing providers, companies with excess cpu time or from your friendly neighborhood tech.

Grid is useful now because the tools to benefit from it are finally easy enough to generate adoption.  Amazon’s EC2 Cloud computing is amazing.  Really it is.  A webservice approach to setting up custom “nodes”.  Billed simply into accounts you probably have had for years.  Tons of documentation, samples, support developers… all there for you.

Yahoo just invested in Hadoop, which is somewhat of a grid computing framework.  Google is a gigantic grid computer system (use GWT to take some advantage of it!)  All available to the Fortune 5, government, and YOU!

Technically, this matters because you can do a lot more when you don’t have to sweat the cycles.  Really.  if there’s no computational limit to what you are doing (other than whether you can afford it) all sorts of new services can be created.  New games, new investor tools, new education software, new advertising, new communications, new social networks.  Bandwidth was the first big dam to break.  With giant pipes readily available, we got to move away from text-only experiences.  Look what’s resulted!?!  Computational power is another dam we’re breaking.  Retargeting of content, behavior analysis on the fly, improved AI…  all available to the common dev.  That’s huge.

At first I thought it would hurt hosting providers, hardware makers and so forth.  Actually though, i think it’s additive.  It’s yet another tool we can all use. It doesn’t replace always-on, dedicated servers nor fast locked-down storage.  It simply gives us lots of cycles as we need them to do interesting things.  And because I can’t see the future in any detail, I can’t make any claims about what it might do to existing industry and technology.

If you haven’t played with this stuff or even read about it, you need to.  It will likely be embedded in most online services (and what isn’t online anymore?) within a decade.  web services and ajax were just the tip of this type of thinking.

Here’s what I want to do with cloud computing:

  • Find largest Mersenne Prime Number
  • Power my Decision Engine product (evolution of search engines to actually guide decisions)
  • Hook into ad servers to reforcast in realtime and retarget media based on behavior
  • Hook into a swarm of networked NXT bots to create social behavior across geography
  • fingerprint all YouTube videos and categorize based on transcripts and similarity scores (good for targeting ads or finding related media)
  • Create first homegrown weather forecasting simulation from Global models to Weather On the Ground. make freely available to all
  • Analyze social networks in real time
  • create a bot to play halo 3 for me all the time, but actually using the controller and data on SCREEN!
  • more more more
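
On the Mersenne prime item: the standard tool for that hunt is the Lucas-Lehmer test, simple enough to sketch here (finding a record prime is the hard part, not the test):

```python
def lucas_lehmer(p):
    """Return True if the Mersenne number 2**p - 1 is prime (p a prime exponent)."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the recurrence below needs p > 2
    m = 2 ** p - 1
    s = 4
    # Iterate s -> s^2 - 2 (mod m) exactly p - 2 times.
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# 2**13 - 1 = 8191 is prime; 2**11 - 1 = 2047 = 23 * 89 is not.
print(lucas_lehmer(13), lucas_lehmer(11))  # True False
```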

~Russ

Read Full Post »

FRAGILE MEDIA

Have you ever gone outside your apartment or house to look at the connectivity – the actual wires leading to your internet, phone and television? It’s chicken wire.

Have you ever been to a typical datacenter housing all this data we like to consume? Mostly chicken wire and not secure.

In this day of security and war on terror and infrastructure taxes, the fragility of our data streams is ironic. (is that the right term? You get my point…)

Few people understand the cobbled together ecosystem.

  • HTML was meant to link academic papers together and it is a loose standard (no one follows it) that helps generate trillions of dollars in revenue
  • Cable, dsl have no redundancy. If your line is cut, you’re off the grid
  • VOIP has little location ability and no power backup, rendering 911 and other sensitive operations impossible
  • Most websites are single points of failure
  • If google is down for 1 minute, no fewer than 100,000,000 queries will go unanswered and websites will lose one million dollars (per minute)
  • Viruses are EASY to write and even easier to transfer
  • Ipod batteries are not replaceable
  • Cellphones work only 90% of the time in the CITY, and even less in rural areas (and these are the future of communication)
  • Writers striking can shut down a huge amount of content flow and ad dollars
  • And so on…

The question is… so what? It all seems to work. I will follow up with how this very fragility MAKES it all work even better.

Good Morning.

~Russ

Read Full Post »

There’s just more and more analysis and speculation about how critical it is to be fast.

Slashdot linked to this fairly decent NYtimes article about Google and Microsoft.  One of the key points, which I actually agree with, is that GRID COMPUTING IS AT THE CENTER OF ALL FAST INNOVATION IN THE VERY NEAR FUTURE.

I have tons more to talk about with grid computing and will save that for a later entry.

I wanted to also point out the recent Yahoo! announcement of their investment in the Apache Foundation and the hiring of a key person there.  They don’t hide from the fact that this will help them be FASTER due to the bigger, quicker open source community.  It’s clearly yet another move to compete with Google and Microsoft.  It’s particularly strategic considering Apache is a foundational component of so many web things, and Lucene and Hadoop (more grid computing!) are directly competitive with Google, Microsoft and Amazon.

Recently Google announced they are working on these web things called “knols” which are basically wikipedia/about.com-style landing pages for search results.  (Gee, all those landing pages on the web are worth something? go figure.)   This is a “get it faster” strategy too… as in: it gets them more ad dollars faster than waiting for Wikipedia and others to put Google ads all over the place…

i could go on and on about moves big companies are making to simply KEEP UP.  it’s damn near impossible to keep up the pace at any company.  Why?

  1. The talent is fluid.  They can do it themselves or they can go to the competition
  2. Big companies almost always get slow.  Start ups don’t have the cash flow to invest heavily in things like super computing, lots of bandwidth, etc. etc.  (one way or another something is subverting speed)
  3. The foundations of the technology are changing quicker than we can agree on standards.  (Over 10 wireless standards, no browsers work the same, .Net is on version 3.5, Vista isn’t taking over, Intel Macs make them a force (but who knows how to code for that), Flex, Silverlight, Ruby on Rails, Python, Ajax…).  Without standards it’s hard to educate buckets of programmers.  Without lots of programmers, it’s hard to transfer knowledge quickly.
  4. If the programmers aren’t getting it fast enough, who in the organization is?
  5. Transparency to users – they get their say and they say it hard and fast.  course correct quickly or it’s over
  6. China
  7. India
  8. Tampa – cheaper US labor markets, accessible for high tech/remote projects
  9. and tons more reasons

In fact, I have so much to say on SPEED, i’m going to launch into a series of posts on who is fast, what it takes to be fast, what undercuts speed, how you can’t fail fast enough………

fast to bed now.

~Russ

Read Full Post »

One of the great mindbenders of our lives is End of Life directives.

Here’s a list of where directives go haywire:

  • unclear language in the document
  • document “enforced” by someone other than the subject of the document
  • document not present during decision making
  • assumptions and pre conditions by family, self, doctors
  • written policies surrounding use of DNR
  • unwritten policies
  • spur of the moment context/second guessing
  • diagnosis of what’s really End of Life
  • lack of directive standards
  • and so on…

It really bends my mind to consider that one of the most final decisions we can make about ourselves or family members is this damn gray. in previous posts I talk about the data collection and precision targeting of our world, and yet, with directives we bring NONE of that approach.

Directives are a terrible information device – at this time. What can we do to clear them up? what can we do with the context surrounding them? is it just a matter of experience – the more we interact with them the more precise and effective they become?

things that make you go… hmmm… argh… help!?

Check this site out for tons of cool analysis and concepts.  http://www.eperc.mcw.edu/ff_index.htm

~Russ

Read Full Post »

Halo Stats (no laughing)

Check out my halo 3 stats.

This is an amazing example of how much data is available to mine.  You can see how i’ve played the game, how others view me, how changes to the game changed my play, how I reacted to marketing.

Match this data against news about Halo 3, census data, labor statistics, macro economic indicators….  combine it with facebook, linkedin, opensocial networks, google search data… and on and on.

We can do this for far more than gamers. and lots of companies do!  What’s cooler (more scary?) you can do it yourself to others.

~Russ

Read Full Post »
