The way this is laid out, it looks like the entities in a row are associated, which is confusing. You name the columns at the top and label the ranks on the left.
I love the use of different-sized circles, but I feel like you're comparing apples to oranges in a way. I think the circles should all be relative to one master size (#1) within their own category only; rather than comparing Ron Conway to Google, his circle could be of equal size. This way, when searching for entities, their circles would make more sense because the user is aware of the consistent max circle size. By comparing Conway to Google you're giving up a wider scale you could be using for the People circles.
Still though, cool project. I found the 5-person startup I'm interning at this summer :) Makes for a great crash-course on who matters if someone were trying to study up on the startup scene.
We tried both options and liked using one scale for all categories. The idea is that you are comparing influence, or something like it, so that can be visually compared across categories. You do have the downside of having very small dots for the people after only a few pages, but since there are more than 100,000 people, most are going to have the minimum-sized dot anyway.
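To make the two sizing options concrete, here is a minimal sketch in Python. The scores, the radius bounds, and the square-root (area-proportional) scaling are all assumptions for illustration; none of it is taken from the site's actual code.

```python
import math

MIN_RADIUS = 2.0   # hypothetical minimum dot radius, in pixels
MAX_RADIUS = 40.0  # hypothetical maximum dot radius, in pixels

def radius(score, max_score):
    """Scale a score to a radius so that circle area is proportional to score."""
    scaled = MAX_RADIUS * math.sqrt(score / max_score)
    return max(MIN_RADIUS, scaled)

# Made-up influence scores, not real Crunchbase data.
scores = {
    "company": {"Google": 100.0, "SmallCo": 1.0},
    "person": {"Ron Conway": 25.0, "Someone": 0.04},
}

# Option 1: one scale shared by all categories.
global_max = max(s for group in scores.values() for s in group.values())
shared = {name: radius(s, global_max)
          for group in scores.values() for name, s in group.items()}

# Option 2: each category scaled against its own #1.
per_category = {name: radius(s, max(group.values()))
                for group in scores.values() for name, s in group.items()}
```

With these made-up numbers, the top person gets half the top company's radius on the shared scale and the full radius on the per-category scale. In both cases the long tail is clamped to the minimum dot size.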
Dave McClure at #3, Paul Graham at #943; whereas 500 Startups is at #41, and Y Combinator is at #15?
Why is 500 Startups classified as a financial firm, whereas Y Combinator is classified as a company?
Also, I'm surprised that Andreessen Horowitz is ranked #67 of financial firms (given that Marc Andreessen is ranked #14), and Elon Musk is #119 for people. I would have thought they'd both be ranked much higher.
In the data source, TechCrunch's Crunchbase, the influence of Paul Graham, Jessica Livingston, et al. is captured in YC the company, and Marc Andreessen is split between his personal identity and AH. So you get a lot of odd things like that.
As for why YC got classified as a company, who knows? It's an accident of history. They should change it.
"... This was one dataset we used to get started, but we intend to incorporate data from Angel List and other sources as well. ..."
Sooner the better, TechCrunch data is iffy ~ http://www.flickr.com/photos/bootload/2913315731/ though useful for a start point. Did you do a select on the companies to check for multiple listings?
Co-builder of this here. Hope you like it! Don't forget to click on a dot - then you can walk the graph of relationships among startups, people, and financial firms.
This is really interesting. I'm surprised you found that startups with higher centrality raised less money. Not sure how to explain that. Your investor graph has edges pointing both from investors to startups and back, right? Anyway, thanks.
I actually did a lot of Crunchbase clean-up for http://seedtable.com - if people are interested in this perhaps we should collaborate on some kind of data cleansing project.
Want to hustle? This is a great list to go through and familiarize yourself with these people, what they make, what they need, how you can help them, what you might ask them for if they ever offered to help you, etc. Great way to visualize Crunchbase, which is a great resource.
Minor pagination issue: when you click "next" the "prev" link takes its place. I went back and forth between the first page and second page a couple times before noticing.
Paul Graham's credit is mostly subsumed into Y Combinator's rank (#15 among companies). Yahoo has bought a lot of startups. But yeah, there are some things that don't make sense.
I think you can only say that this is a visualization of how things are in Crunchbase, not "the startup world" in general. It's ranking things based on the number of connections each thing has. For angel investors this may be a good indicator, but not necessarily for companies or VC firms.
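As a rough illustration of ranking by number of connections, here is a minimal sketch in Python. The edge list and entity names are hypothetical, and this is plain degree counting, not necessarily what the site computes.

```python
from collections import Counter

# Hypothetical edge list; each pair is one recorded relationship.
edges = [
    ("Angel A", "Startup X"),
    ("Angel A", "Startup Y"),
    ("Angel A", "Startup Z"),
    ("VC Firm B", "Startup X"),
    ("Startup X", "Startup Y"),  # e.g. a competitor relationship
]

# Count how many relationships each entity appears in.
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

# Highest degree first; ties broken alphabetically.
ranking = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
```

An angel with three small deals ties with the most connected startup here and outranks the VC firm with one deal, regardless of deal size, which is the kind of effect described above.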
One request: in the UI, where you have checkboxes, can you add an 'exclusive' select? So for example, I want to see ONLY 'acquisitions' made; right now, we have to deselect everything else manually.
Very cool, but you really need to work on the algorithm. It's easy to pick out the reason why MySpace might be found higher than Facebook or, as a few people have pointed out, PG vs. Y Combinator.
But my last start-up, which I shut down over a year ago (HearWhere, 19,883), is ranked as more influential than companies with significantly larger traffic that are still operating (for example, AllRecipes, 23,087).
Nice to think I'm that influential, but I can assure you, I'm not (yet ;))
I'm not sure if you're joking or not, but there's no such thing as an "impartial" ranking of things.
Any ranking will put importance on certain factors which are chosen from all the possible factors by the designer of the ranking. You have chosen to rank based (I presume) on incoming links on CrunchBase. However, this is intrinsically no more impartial than a ranking based on yearly revenue, or number of employees, or size of their offices, number of mentions in the New York Times, or any other of an unbounded number of metrics.
I would argue that PageRank, while it may have done a good job of ranking websites for the purposes of search, is a pretty poor choice here. It's highly susceptible to reporting bias, where some companies will be better-represented on CB. Empirically, you also get weird artifacts, like the ranking of MySpace ahead of Apple.
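To show the reporting-bias concern in miniature, here is a sketch of textbook power-iteration PageRank in Python. The graph, the node names, and the damping factor of 0.85 are assumptions for illustration; I don't know how the site actually computes its ranks.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    count = len(nodes)
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank

# Hypothetical graph: edges point at the entity being referenced.
sparse = [("a", "BigCo"), ("b", "BigCo"), ("c", "WellDocumentedCo")]
# Same companies, but WellDocumentedCo's profile lists more relationships.
padded = sparse + [("d", "WellDocumentedCo"),
                   ("e", "WellDocumentedCo"),
                   ("f", "WellDocumentedCo")]

before = pagerank(sparse)
after = pagerank(padded)
```

In this toy graph, nothing about either company changes between the two runs except how many relationships are recorded for one of them, yet the ordering of the two flips.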
Choosing a good metric requires a theoretical explanation of why that metric is important and why it helps answer your question (which you don't state, but I suspect is something like, "which are the most important startups", whatever that might mean). Just choosing something doesn't necessarily tell you anything useful.
[If you're interested in high-stakes debates about rankings, look into the hoopla that comes about every year when US News & World Report releases their college rankings. These numbers can have huge effects on colleges' prestige and, lower down the food chain, their bottom lines.]
Yep, there are a number of surprises. For one thing, this is cumulative, so deals made and competitor relationships from 2005 are as good as from 2012. For another, having a well-fleshed-out Crunchbase profile makes all the difference.
Try clicking on MySpace's blue dot and see what is connected to it.