Archive for the ‘Statistics’ Category

From Decision Science News:

What of the adage “the best predictor of future performance is past performance”? It seems less true than Sting’s observation “History will teach us nothing”. Let’s continue the investigation.

DSN did a nice analysis of a ton of baseball game outcomes to see whether a team that had just won a game was more likely to win the next game. There have been other studies like this involving basketball players’ “hot streaks.” They revealed similar results… well, it’s a crap shoot shot to shot, game to game.

Now, over the long haul, winning records and shot percentages indicate there is some skill involved. But at the micro level it just ain’t true!
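
One way to check the streak claim on any win/loss record is to compare the win rate in games that immediately follow a win with the overall win rate. The sketch below is not DSN's analysis: it runs on simulated games with an assumed fixed win probability (`skill=0.55`), and the function names are made up for illustration.

```python
import random


def win_rate(results):
    """Overall fraction of games won (results is a list of 1 = win, 0 = loss)."""
    return sum(results) / len(results)


def win_rate_after_win(results):
    """Fraction of games won among games that immediately follow a win."""
    followups = [nxt for prev, nxt in zip(results, results[1:]) if prev == 1]
    return sum(followups) / len(followups)


def simulate_season(n_games, skill, seed=0):
    """Simulate independent games: each one is won with probability `skill`."""
    rng = random.Random(seed)
    return [1 if rng.random() < skill else 0 for _ in range(n_games)]


if __name__ == "__main__":
    games = simulate_season(n_games=100_000, skill=0.55)
    print(f"overall win rate:     {win_rate(games):.3f}")
    print(f"win rate after a win: {win_rate_after_win(games):.3f}")
```

With independent games the two numbers come out nearly equal, even though the long-run win rate clearly reflects skill. A real hot streak would show up as a win rate after a win that sits noticeably above the overall rate.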

Now why do we as fans, observers and interested parties believe in hot streaks, win streaks, etc.? Is it a side effect of some other useful thing we do in associating events? Or is there really some direct value in assuming immediate past performance indicates a similar future performance?

What can we test to figure that out?

The NBA hot streak article has some insights…

Read Full Post »

This is a really neat, quick piece about Usain Bolt’s impact on sports writers and their comments about humanity and sport.

Usain Bolt

A good example of how unexpected things can shape thinking and approaches.  My favorite take on this so far is over at ScienceBlogs.  I love it that someone plotted the model of 100m times to see how far out Bolt is on the predicted trajectory of speed improvements.

Maybe we’ve got the model wrong.  Maybe he’s an outlier and the model is right.

I have a prediction of my own:  after the fun of this track event is over… somebody is going to argue he’s doping.

Read Full Post »

Update 3/29/09: Danny Sullivan correctly pointed out to me that he is a publisher and an advertiser.  I’ll disagree on the idea that he is a “real user”, by which I meant “regular user”, because he is not, nor am I.  We study websites, traffic and human behavior; we notice, ignore and react to things very differently than a user just flying by to get the latest news and views.  I do agree with Danny that my argument mostly matches his… thus, I’m only calling out Clemons’s argument.

Update 3/28/09: Techcrunch keeps stirring this up.  Now Danny Sullivan replies…

The most damaging part of both of their arguments is that neither one addresses Clemons’s original argument, and the rebuttals mostly fail to refute his claims about the death of Internet advertising.  His conclusions don’t match actual data and experience from the perspective of an advertiser, a publisher or really a regular user.

These points are not defensible without real data:

Users don’t trust ads
Users don’t want to view ads
Users don’t need ads
Ads cannot be the sole source of funding for the internet
Ad revenue will diminish because of brutal competition brought on by an oversupply of inventory, and it will be replaced in many instances by micropayments and subscription payments for content.
There are numerous other business models that will work on the net, that will be tried, and that will succeed.

In fact, let’s consider some counter examples:

Someone sold 4 million Snuggies based on ads.  Did the people who responded to those ads not trust the ads?  Their behavior shows they trusted them enough to fork over $15 for a blanket with holes in it.  The better statement is some users don’t trust some ads.

Users do want to view ads.  Millions of people love Super Bowl ads and actually seek them out online and on their TiVos.  Online-only ads that people do want to view include the millions of mini games they play, YouTube videos they watch and contests they enter.  A better statement is that some users do want to view some ads, especially when the ads are engaging, useful or catchy.

Users do need ads.  Search engines and social graphs can only show you information about things that are already popular/reached tipping point.  They cannot show you stuff just coming out of the labs.  Users need ads to learn about new and different products and services.  And the only way to introduce people to new things is put new things alongside already known things.

Ads are not the sole source of funding for the internet. Anyone who is claiming this is what web companies think clearly has not really studied the industry or worked at a web company and/or companies that extensively use the web in their business models.

Ad revenue will continue to grow in the long run.  As long as businesses need to sell more product, more ad revenue will go into the market.  The difference is that the ad spend is spread among more and more entities, so individual businesses will get less ad revenue.

Many other business models already work, and more will be created.  Selling apps, selling computer time, renting server space, selling subscriptions, donor models, barters, licensing, premium access…. I mean, gosh.  I don’t think we lack for business models that work.  The media is simply pointing to the high profile failures of big media companies that haven’t figured out how to shoehorn their model into the internet way of doing things.

Once again we see that pundits rarely represent the real story.  They don’t know the price of milk. Just talking to people in the industry and summarizing the conversation is not enough to predict the end of online advertising.

See below for rest of my original response.

—————

Despite its impressive length, a recent TechCrunch guest feature on the failure of internet advertising fails to reveal what’s really destroying the ad model online.  Clemons neither states what exactly he claims is failing nor really proves that it is. Alas, I will still attempt to refute the possible implications of his claim.

It is not a particularly insightful observation that “The problem is not the medium, the problem is the message, and the fact that it is not trusted, not wanted, and not needed”. Of course people don’t like being distracted with ad messages.  That’s always been the case, that’s why marketers have to pay for ad placement.  Nothing new here.

Advertising itself is not broken, nor will it ever go away.  As long as companies have products they need to push into market, they have to advertise, regardless of the nature of the medium.  Play with the language and state definitions all you want: advertising will always be a part of our lives and media experiences.

What’s wrong with the business models of sites that rely on advertising is the pricing, not the actual idea of advertising.  Spending in terms of dollars is down in all mediums, certainly.  However, the amount of advertising we’re exposed to is likely still growing.   I have a long post on all sorts of data points on this topic here.  The short of it:  marketers have a growing number of advertising impressions out there, everyone knows how well they perform, and thus the pricing is coming way down from the relatively overpriced “older” advertising models in print, radio and TV.  This shrinking pricing model puts pressure on the business from a margin standpoint and so the less efficient businesses fail.

Yes, I generally hate banner, text, billboard ads and neon signs like everyone else. Except when I don’t.  And when I don’t that’s valuable to the company that paid for that placement and it’s valuable to me to be notified of something I might have missed.  We’re just arguing price.

Read Full Post »

An interesting approach to knowledge mentioned in Stephen Wolfram’s blog:

But what about all the actual knowledge that we as humans have accumulated?

A lot of it is now on the web—in billions of pages of text. And with search engines, we can very efficiently search for specific terms and phrases in that text.

But we can’t compute from that. And in effect, we can only answer questions that have been literally asked before. We can look things up, but we can’t figure anything new out.

Let’s see where this goes!

Read Full Post »

Reproduced from  a private email by Mahesh Johari:

For most of the last 25 years we (as a nation) have been sold a story about investing in stocks for the long run.  Invest steadily, mindlessly, and over the long run stocks will earn almost 10% annual returns.  By the time this bear market has ended, this notion will be questioned by a great many.

I’ll give all of you a head start.

Let’s quickly review one of the greatest achievements of the past 15 years – the rise of the personal computer and the growth of the Internet.  During this time we saw two giants dominate this market – Intel (NASDAQ: INTC) and Microsoft (NASDAQ: MSFT).  They were the veritable Pippen & Jordan of the tech Bulls dynasty: nearly pure monopolists with gigantic profit margins, huge revenues, and fantastic cash flows.

Today, Intel’s share price closed at $12.08, the same level it was at when I turned 25 years old.  That was almost 12 1/2 years ago.  Let’s look at Intel closely and see what they have to show for this incredible run.

At the end of September in 1996, Intel’s share price was $12.08.  Adjusted for splits, there were 7.14 billion diluted shares outstanding.  The book value per diluted share was $2.09.

We could go into a brief academic debate about why I’m using book value instead of some other measure.  Book value is the accounting net worth of the company.  With some caveats, it is a reflection of value that takes into account all of what the company owns and all of its obligations.  The book value reflects the amount of capital the company has available to deploy productively in the course of business.

I use book value per share because one share of Intel essentially grants you ownership to that amount of book value.  If you simply hold that share, you could imagine that the value backing that share of Intel is growing by the same rate as the book value.  Book value is not influenced by the share price, which can fluctuate wildly with the market.

Compare it to your own situation.  If I asked how you have done financially over the last 12 years, I could look at your net worth 12 years ago, compare to what it is today, and have a pretty good idea of how you fared financially.  It’s the same idea.

Today, Intel’s book value is $6.80 per diluted share.  Over the last 12 1/2 years, it means that Intel has grown book value per diluted share at an annual rate of 9.94%.  Some of you will argue that I am not including dividends that were paid out.  Those dividends have totalled $2.25 over that time frame.  Including dividends, Intel generated annual rates of return of 12.50% over the last 12 1/2 years (assuming you didn’t reinvest the dividends).
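
The annualized figures above are compound growth rates, and they can be roughly reproduced from the numbers in the email. The sketch below is an illustration, not the author's spreadsheet: the function name is made up, and `YEARS = 12.5` is an assumed holding period, so the results can differ from the quoted 9.94% and 12.50% by a few hundredths of a percentage point.

```python
def annualized_rate(start_value, end_value, years):
    """Compound annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the email; YEARS is an assumed approximation of "12 1/2 years".
BOOK_1996 = 2.09   # book value per diluted share, end of September 1996
BOOK_TODAY = 6.80  # book value per diluted share today
DIVIDENDS = 2.25   # total dividends paid per share over the period, not reinvested
YEARS = 12.5

book_growth = annualized_rate(BOOK_1996, BOOK_TODAY, YEARS)
with_dividends = annualized_rate(BOOK_1996, BOOK_TODAY + DIVIDENDS, YEARS)

print(f"book value growth:   {book_growth:.2%}")
print(f"including dividends: {with_dividends:.2%}")
```

Adding the dividends to the ending value, rather than compounding them, matches the email's assumption that dividends were not reinvested.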

That number sounds pretty good.  Until you realize that Intel was a near monopolist operating through one of the highest growth phases of their industry.  Think about that for a minute – a monopolist in a boom was only able to generate 12.50% per year in returns.  What does this imply for the 495 companies in the S&P 500 that are NOT monopolies, and are NOT going to be going through an incredible boom in demand?  A 10% annualized rate of return for the entire market suddenly sounds like a fantasy, doesn’t it?

So what the heck happened?  What about all those studies that touted how stocks would make 9-10% over long periods?  Is 12 1/2 years not long enough?

What happened is what always happens when people blindly follow historical statistics en masse.  The underlying behavioral model changes.  As naive shareholders piled in and stopped paying close attention to how the company was run, profits got transferred from shareholders to employees through stock options and bonuses.  Additional billions were blown on share buybacks at much higher prices.  Looking at the result after more than a decade of shareholder un-friendly behavior, it’s no wonder Intel’s results are mediocre.

The time to buy stocks for the long run will be when those that bought for the long run realize they have been fleeced.  When those people are so disgusted that they sell at low prices and the shareholder outrage forces companies to change their behavior – that is the time to buy stocks for the long run.  Price matters.  That’s how it has always been.

For those who like to check the numbers, I suggest the following:
Intel 2008 Annual 10-K:
http://idea.sec.gov/Archives/edgar/data/50863/000089161809000047/f50771e10vk.htm#301
Intel Q3 1996 10-Q:
http://idea.sec.gov/Archives/edgar/data/50863/0000050863-96-000040.txt
Intel splits (shown on chart):
http://finance.yahoo.com/q/bc?s=INTC&t=my

As a side note, if you actually bought that share of Intel in September of 1996, your rate of return was not 12.50% but 1.38% annually, assuming you didn’t reinvest dividends.  Ouch!
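
The side note uses the same compounding arithmetic, applied to what a shareholder actually received: an unchanged share price plus the dividends. A minimal sketch, assuming a 12.5-year holding period and using a hypothetical function name:

```python
def shareholder_annual_return(buy_price, sell_price, dividends, years):
    """Annualized return from price change plus dividends that were not reinvested."""
    ending_value = sell_price + dividends
    return (ending_value / buy_price) ** (1 / years) - 1


# Bought at $12.08 in September 1996, still $12.08 today, $2.25 in dividends.
rate = shareholder_annual_return(12.08, 12.08, 2.25, years=12.5)
print(f"shareholder return: {rate:.2%}")
```

Because the purchase price and the closing price are the same, the dividends are the entire return.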

Read Full Post »

Latest data has Oscars Ratings up about 6% (see here and here), a little above 30 million viewers.

This was in line with what I imagined.

If you’re looking for Winners and actual info on the show, here ya go and here.

Read Full Post »

In the most “Duh, I knew that already” post of the decade…

It’s not all negative out there.  Some businesses actually benefit quite a bit in economically challenging times.

Some key examples:

I’m sure there are more examples.

Death, taxes, food, water, seeking work, support, debt… can’t escape those things no matter the economic conditions.

Read Full Post »

Ah, TechCrunch.  You whipped out the old Excel and made the industry famous Up And To The Right Chart.

There is no strong conclusion to draw from this very limited data set.  The only piece of interesting data is:  of the big 4 media companies presented here only Google has any sustained growth and makes up the majority of 4th quarter growth.

The industry is NOT doing well at all.  Time Warner and News Corp had major losses, and they have huge stakes in the internet.  Yahoo, MSN and AOL all suffered major losses.  Mid-tier and smaller publishers are getting crushed, and you won’t see that in any of this data or any eMarketer reports.

If companies in the media industry want to actually survive, get real.  The existing ad model stinks and this recession just nailed the coffin shut.  Don’t take my word for it, run your own analysis. (Just for fun, go look at the ads running on Yahoo, MSN, Facebook, MySpace, Video game sites… let me know if you see a direct sold campaign.  Let me know if you find a non adnetwork ad tag…) Hurry and do it, because this is one short runway and there ain’t no Hudson river nearby.

The biggest advertisers in the game (financial services, computer companies, and auto makers) all took huge hits and continue to falter.  The ad budgets have been slashed and they aren’t moving product.  With that mix, media companies can’t do anything about their ad revenue streams if they don’t find other ways of making money.

No, I’m not doom and gloom.  This is all about reinvention and change and exploration.  The old model stinks and now we get to find out what to do next that is better for the user and the advertiser.  This is good.  It is also painful. It is not Up and To The Right, despite the fact that Excel seems to only spit out charts of that type.

Read Full Post »

An article attacking the use of fMRI (functional magnetic resonance imaging) in social neuroscience as flawed, and the overselling of results by scientists, was printed in Nature under the above name and authored by Alison Abbott. It has touched a nerve in science. [Yes, it is a double entendre…] Social neuroscience is the study of the neurobiological mechanisms underlying social behavior.

The field frequently uses fMRI to reveal which brain areas are activated while a subject is exposed to specific social interactions or socially relevant content — e.g., situations that may evoke anger, jealousy, spirituality or guilt.

But a lengthy and no-holds-barred paper accepted for publication in Perspectives on Psychological Science, and already circulating widely on the Internet, claims that many studies misused statistics, measurement and methodology to make points and support positions that were not supported by the data, as assessed by a very large team of researchers that understand both the technology and the statistical logic used in almost all science fields.

Some of the authors of the criticized papers responded with dismay, anger and guile. One is reported to have stated, “This is not the way that scientific discourse should take place.”

This is indeed the way scientific discourse needs to take place. I am amazed that Ed Vul and his colleagues had the stones to delve into what others suspected but were, for lack of a better term, too lazy to analyze, and then to question individual authors about their imaging methods. In any event the methods need to be substantiated more now than in the past due to the proliferation of information. If left untested, the number of studies and the aura they create can generate a form of viral buzz that makes some research acceptable before it is validated, reproduced or reviewed by peers. Given the sensationalism with which some of the results are heralded, springboarding people into guest spots on Dr. Phil’s or Oprah’s show, the microscope needs to be everywhere if we as a society continue to put science on the same dais as sports and financial logic. Clearly there is always the potential of a McCarthyesque tone now that the White House has turned its attention to things other than pillorying the efforts of science.

The amount of published research from imaging experiments has drastically increased over the last 10 years. Of course, some scientists are very knowledgeable and have a really strong grasp of their methodology, from the data acquisition end through the analysis, and there are others who use a plug-and-play approach. An area of the brain lights up and both types of experimental teams have the opportunity to explain to the reader what it all means. When this is done in retrospect, enormous problems raise their shiny little heads.

The criticized authors complained about the immediate publicity of the criticisms. This may indeed not be the way that scientific discourse took place in the past but the Internet has changed the rules as all in politics and business have come to find out. Now science is seeing that ‘change’ is part of the lab and the lecture hall as well as the fMRI tube.

Questioning findings that were publicized is a requirement of science first and the populace second. There are thousands of journals and more coming every day. Some have yet to earn their chops. Their reviewers are the same people who get published there, and they have incestuous relationships like coaches in the NFL. Clearly, for science not to return to the dark ages it must regulate its content or someone else will do it for it, and that elicits pictures of Alberto R. Gonzales or Rush Limbaugh or perhaps the former attorney general of New York State, Eliot Spitzer. It seems appropriate that the criticisms be addressed and answered by the same audience. Journals that accept manuscripts according to the chance that they make headlines in the popular press may want to consider a different strategy.

All experimenters better be sure they can explain their data, particularly those who don’t work in the field of brain imaging, and are at the mercy of the reviewers to assure them that those images weren’t made in Photoshop.

Short of that, use the Baloney Detection Kit published here. If the authors appear out to prove something rather than understand something you might want to apply the Kit. In either case, you are going to continue to hear about fMRI and interpreting results for many more years.

Now, you want to know what the real problem is?

There is a lack of empiricism in the questions being asked in the first place so this distraction is the same old pea soup in a kettle that psychologists, philosophers and non-scientists everywhere have wasted the eons away with. No matter how exact, exotic and sophisticated the instrument you use to measure, if what you measure is not observable, empirical and the concept is not falsifiable, it is equivalent to flying in an airplane at 35,000 feet at night and looking out the window for answers to even the big questions like, “What the heck is going on out there?”

Good luck with those subjective approaches… let me know how that pans out.

Read Full Post »

Failure to understand how users and money flow through the Internet costs media and etailers a lot of money every day.  There are huge misconceptions about where the “value” actually lives for user data, advertising performance and profit margins on all this high tech.

The following figures attempt to disambiguate some of the confusion.  The summarized conclusions come from a variety of data sources and real life experiences analyzing financial statements, traffic reports, advertiser analysis and experimentation.  Specifically, one could get somewhat exact figures by combining comScore, Quantcast, Compete, Google Analytics, TNS, @Plan, SEC filings, internal reports, revenue statements and DART forecasting, as I have done several times.

This post is meant to be a demonstration of the core concepts, not a statistical treatise on the topic.

If you hate reading too much, skip to the end for a somewhat realistic example of how traffic flows.

Traffic on the Internet splits roughly into 7 segments (as shown in the figures below).  These segments are defined by where they sit in the user experience by amount of consumptive behavior (clicks, reading, sharing, watching). How the user gets from segment to segment is not completely linear in actuality, but when you aggregate a user’s behavior you’ll roughly see a funnel in terms of time spent, pageviews and ad impressions.

Traffic Funnel

The segments can be characterized also by their ad performance, ad targeting (how specific is the user in their activity), and their audience coverage (how much of the particular audience segment does a type of site/service reach)

Funnel Traffic Segments

Each segment has a different cost profile.  Here I look at labor costs to maintain and capital expenses to build and power.

Where's the Cost?

As you can guess, each traffic segment has a different profit profile too.  This is largely the result of combining the advertising/revenue performance with the cost profile.  Certain Internet services simply do not have a strong profit opportunity because they borrow old models and/or cost more than the market is willing to pay. (Perhaps that will stabilize one day, but I think software tools and low cost hardware disrupt the demand curve A LOT because users can often supply their own demands once the cost gets too high, hence why TOOLS are the most profitable segment.)

Profit Margins by Segment

Make no mistake about what I’m presenting here.  The profit online is all in retailing, portals/search and tools/utilities.  The stuff in the middle of the funnel is highly susceptible to competitive displacement and has very little intellectual property protection.  You can verify this conclusion by reviewing revenue statements and SEC filings for the big tech and internet companies.

The advent of citizen journalism and self publishing flattened the media market.  Owning a printing press was once “high tech” and a capital investment barrier.  Owning the right location on the main street was once a logistical barrier.  High speed computers and difficult programming languages were once a technical barrier.  Those 3 barriers are gone.  Media is now, well, almost purely a creative barrier.  There’s a huge pool of creative talent constantly struggling against each other.  Creativity is worth a lot once it rises above everything else.  That happens so rarely as to make it a bad investment.  Every minute more and more people enter the creative market (how many blog posts per hour? how many videos go up each day?… a lot.)

Organizing, sifting, filtering, distributing, aggregating… that’s the sweet spot.  There is a technical hurdle, but the investment is worth it as there will never be less of a need to filter, sift, find, distribute.

This week we had a beautiful illustration of these concepts with the Presidential Inauguration.

Most US users watched the Inauguration: most on TV, a lot via online video streams and 2 million in person.  During and immediately following the inauguration the Internet lit up with content creation and massive usage.  The portals and search engines featured as many new links and breaking stories from the news coverage as they could.

The social networks shot pictures, tweets and status updates around, occasionally referencing links to the confirmation gaffe, benediction speech text, and satellite pictures from DC.

Micro bloggers summarized everything as fast as they could, while the search engines and utilities sucked in that content.  The original content creators probably released a previously composed story and put that live.

Mainstream users shut down their video streams and took to the portals and search engines, seeking more info on what just happened or insight into a specific moment.  Most times they ended up at CNN or NYTimes.  Many times, but less frequently, they hit a blog that had some recent content.  Most users probably ran into a Wikipedia reference link or YouTube video.

Some users ended up on Amazon to buy Obama’s books or some inauguration swag.  Finally, as the day concluded and original content creators finally had enough time to craft something, users might find themselves falling asleep to a good OpEd on the history of the day or an interview with the Michelle Obama dress designers.

By 3 days later the amount of content available on the inauguration is 1000x greater than within the first 10 minutes.  Original content creators are hopelessly buried amongst the blog posts, tweets, continuously AP-fed CNN articles and YouTube embeds.  The bloggers are buried by other bloggers.  The news stories give way to other news stories.

The utilities that sort, sift, filter and monetize on it all just got a 1000x better experience and continue to catch the huge volume of user investigation and digging.  They own the head, the trunk and that dreaded long tail and collect user targeting data all along the way.

Read Full Post »
