Sifting through knowledge

this is the first in a series of quixotic articles that I will post this week, based upon some observations and ideas that I have shared since January 1st.

or: open-ended idea sharing (as usual, under “creative commons for attribution”, i.e. use as you wish, but remember to quote and link the source).

a small experiment-within-the-experiment: I will keep longer articles on my blog, posting instead the 1000-word variety on social networks, but the announcements for each article will always be posted on @robertolofaro.

this article will start with a digression on “online profile building and searching”, and then get back to the title, leading gradually from the mere sifting of knowledge to knowledge ownership and its relationship with our institutional framework.

I am quite confident that you have more online profiles than you need

last Friday I had a conversation (one of many) about a website, and I was asked: what do you think about the profile feature?

if you worked with me on data-intensive projects whose audience was supposed to be senior managers (since my days with an Andersen Consulting unit on Comshare’s DSS - decision support systems - solutions, in the late 1980s), you know that I have been heavily influenced by my short experience in politics within a European advocacy group, and by other extra-curricular activities that I carried out before I officially started to work, in 1986.


  • I think that information is never neutral
  • consolidation is a selective process (you choose how and what to consolidate, hence- which details will be “drowned” by the data)
  • the transformation process that starts with data (selective by itself, as we humans have limited time and an even more limited attention span), turns it into information (i.e. processed data), and ends up with knowledge (i.e. what you, personally or as a group, know) is even more selective than consolidation (who decides what is relevant?)
  • presenting data or, even worse, information should be done keeping in mind the forma mentis of the audience (if you ever want to see it converted into knowledge, i.e. embedded in your audience’s knowledge)
  • if you followed this line of reasoning so far, there is only one logical conclusion: if you want to make whatever you deliver “stick”, you have to make it relevant to each member of the audience, and you need therefore to know at least a little bit about them before you open your mouth (or write the first line of that report)

therefore, whatever the reason you collect the data for, making sense of it implies also thinking beforehand about its potential uses: you can reclassify, restructure, consolidate, summarize only what you have been provided with, and you should collect what is relevant, not “just in case”.

albeit, in our technological times, I have seen more than once “executive support systems”, under various labels, deliver more than anybody could chew in a reasonable amount of time (and “reasonable” is a relative concept).

more than once my feeling was that that overload, coupled with the poor “format” of the presentation (i.e. ignoring what happened after the delivery), had the aim of shuffling inconvenient truths under the pile.

so that the producer could claim that the data were there, if and when something happened and the producer had to cover his/her back (if you want: a producer hedging against potential trouble; one of the side-effects of ISO9000, SOX and other “governance” drives introduced without really changing the internal checks-and-balances and “carrot and stick” systems).

sometimes, what you need is meaningful data - quantity is less relevant than quality: and no matter how large the “data collection”, often it takes mere seconds to spot conceptual issues or data quality concerns.

to give a visual example: after supporting a COO/CFO, I was offered a position as Financial Controller (in 1992, at 27 - yes, I am that old); then, I had to pass the interview with the CEO, and the first question was… here is a report from our regional managers, and here is the budget - what do you think?

I said: if the managers are asked to deliver a monthly progress report vs. the budget on each line item, I do not see which shared method they use to derive the monthly figures from a yearly one (the target budget per line): and that was the acid test - everything we talked about before and after was mainly “padding”.

eventually, I did not sign the contract (anyway, I kept being a consultant until 1997, when needed), but I must say: it was one of the most interesting job interviews I ever had - and, as I saw more than once in the following couple of decades, it was the kind of interview I had with CEOs, as more technical/detailed issues were left to interviews with other people (or trial projects - unless they already knew me through other sources).

as the title of this section says: aren’t you as bored as I am by all those profiles that you have to fill in (and not only when registering on job websites - also on any website for any company or social network)?

also because quite often each profile reveals the mindset of those defining its structure, or the uses that they will make of the data they are collecting (e.g. data mining for marketing), not the value that the profile delivers to you, the user, or to any other member of the potential audience.

if your website is a generalist social network, then probably you provide more information than is really needed for the services that you use - as such networks focus on revenue generation through advertisement, marketing, and so on.

and how often did you find a website asking for information that is irrelevant even for its own purposes?

but I assume that a basic use of anything within your profile is enabling other users to find you.

if your website is targeting a specialist community, you probably have multiple potential audiences:

  • other users
  • people looking for groups of users
  • people searching “clusters” of users
  • etc, etc

the difference between a cluster and a group? the former is basically an assessment that identifies people sharing some characteristics, the latter is a pre-defined classification of people.

as an example: if I have access to the appropriate data, I can see that 20% of the profiles within a certain website are irrelevant to my search, 40% are potentially relevant, 30% are somewhat relevant, 8% are relevant, 2% are my real target.

then, I can see “who” is within my 2%, and maybe sample across the others (incidentally: allocating time by priority- e.g. detailed analysis on my 2%, down to keyword searching, and adopting a mix of criteria to search in other categories).
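the prioritisation described above can be sketched in a few lines; the profiles, scores and thresholds below are hypothetical, just to illustrate the bucketing-then-sampling idea:

```python
from collections import Counter

# hypothetical relevance score per profile, 0.0 (irrelevant) to 1.0 (target)
profiles = {"anna": 0.95, "bob": 0.10, "carla": 0.55, "dan": 0.80, "eve": 0.30}

def bucket(score):
    """Map a numeric relevance score to a qualitative bucket."""
    if score >= 0.9:
        return "target"            # gets detailed, keyword-level analysis
    if score >= 0.7:
        return "relevant"
    if score >= 0.4:
        return "somewhat relevant"
    return "irrelevant"

buckets = {name: bucket(score) for name, score in profiles.items()}
counts = Counter(buckets.values())   # the percentages discussed above

# allocate attention by priority: study targets in detail, sample the rest
targets = [name for name, b in buckets.items() if b == "target"]
```

the thresholds are, of course, the whole point: whoever sets them decides what gets drowned by the data.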

the key concept is: I, the user, have to be able to choose what is relevant to me.

instead, most websites let you search only in a quantitative way (e.g. a tickbox for those with a certain degree, another one for those who like surfing, and so on).

if you have multiple audiences, I think that you should provide a search method that is relevant to each audience - qualitative clustering instead of quantitative grouping is the first approach that I keep advocating.

in my discussion on Friday, I said: from my “qualitative” interviews, I think that instead of tickboxes a “mix” of characteristics could be more useful for “qualitative users”.

as an example: sometimes, managing a small-budget project with high-level stakeholders is more critical to building a successful business relationship than the run-of-the-mill, highly industrialized, rote-learning management of pure delivery projects based on a preset series of components.

I assume that you will agree that building a single villa for somebody who is on the front cover of magazines is not the same as building the 100th copy of the usual 200-units block.

a qualitative search would enable preliminary vetting of those who fit a certain “behavioural profile”, instead of having to wait for a face-to-face meeting to discover that you were looking for a negotiator, and instead received a control freak.

if you attended any of my presentations or brainstorming sessions, even as a friend or at a party, you know that I cannot resist: I think through pictures, and eventually I will use anything available to underline my words with a “multimedia” approach - e.g. going as far as using bread crumbs to show how to proceed in a negotiation under the parameters that you provided, or shifting glasses and bottles, or even using the standard tool for any strategist, the paper napkin (and yes, sometimes I jot down curves to show why something does not make sense).

but often I like to visualize quantitative information through qualitative means, and vice versa.

last Friday, I did what I started doing as soon as I had an online website (I registered my first website in 1997): I shared a link to an article where I explained visually how to move from quantitative to qualitative and vice versa.

specifically, I said: why don’t you add a search feature that uses a “radar chart”, where the person searching can move markers across a few dimensions of analysis, and then receive the profiles within the range that (s)he selected?

example: find me those who are

  • on reading, between “book worm” and “frequent reader”
  • on cooking, between “cooks well” and “enjoys cooking”
  • and so on

I wrote “between” because you can limit the variation as I did in the article (e.g. from 1 to 5), but you could also have a different, more “fluid” range, instead of the usual binary yes/no.

then, you can do as I did in the article: show how each one of the profiles found matches the profile you were looking for.
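a minimal sketch of such a range search, assuming a 1-to-5 scale per dimension; the dimension names, profiles and scores are made up for illustration:

```python
# each dimension is scored 1..5; the searcher picks a (low, high) range per dimension
query = {"reading": (3, 5), "cooking": (2, 4)}

profiles = {
    "anna": {"reading": 5, "cooking": 3},
    "bob":  {"reading": 2, "cooking": 4},
}

def matches(profile, query):
    """True if every queried dimension falls inside its selected range."""
    return all(low <= profile[dim] <= high for dim, (low, high) in query.items())

def match_score(profile, query):
    """Fraction of queried dimensions inside range, to rank partial matches."""
    hits = sum(low <= profile[dim] <= high for dim, (low, high) in query.items())
    return hits / len(query)

found = [name for name, p in profiles.items() if matches(p, query)]
```

the same match_score could then drive the visual overlap between the searched-for profile and each result on the radar chart.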

I used a similar approach whenever I had to advise a customer on selecting new employees/collaborators, getting a new supplier, or negotiating a partnership (or the reverse): set a model, look at what you have, and then see who and what (not necessarily in that order - sometimes, the “who” is predefined, while other times the “what” is).

sometimes this implies multiple iterations, as some of the candidates have complementary strengths and weaknesses, and, if somebody able to facilitate the activities is available, the first round of assessments could actually shrink the group, and maybe a joint delivery can be designed: of course, if everybody agrees.

but maybe the search facility should be tailored - e.g. allow not only the “radar chart”, but also to first sift through data to find your own “parameters” and “ranges”, and then save your own preferences (yes, having some pre-built on the menu would be useful, but enabling users to then “customize” them would be even more useful).

I think that reading a few books on the social side of online business could be helpful, even if you are not really interested in setting up a website, as I have seen offline spill-over effects of online business patterns more and more often (incidentally: my online library is on LibraryThing).

as for the “guru industry”: there is a compulsive visibility element that, in the end, often clashes with the original aim - spreading an idea (i.e. the means become the ends).

but it is fine if you are doing that as a business, and even if the authors of the books that I listed are definitely in the “guru” business, usually the first or second book is still worth reading (in some cases even more).

Searching knowledge

I would like to shift back to the “search” concept: as the Internet is just like an enormous haystack, we need something useful to find the needle we are looking for.

unfortunately, the Internet is even worse than the proverbial haystack, as each needle is spread, copied, distorted, linked with something else, and not necessarily relevant.

but while long ago we had Yahoo, based on the “library catalogue” concept (i.e. “librarians” classifying links), through various steps we moved on to different paradigms - Lycos, Altavista, and, of course, Google, to name but a few.

and, as many others, I am frankly puzzled by the idea that the search engine should “refocus” my search based upon what I already searched (not the details, but the areas of interest).

why? because it is a kind of self-fulfilling prophecy: you start focusing search_number_two based upon search_number_one, then search_number_three is even more “focused” (and if you use the whole search history, it is even worse, as the user never really obtains the results that could be expected, due to the “filtering”).

side-effects? imagine what would happen if you were to apply that to your library - the more you search, the fewer books are used for future searches, based on your historical preferences (and this, even ignoring the potential new books that would get added to the library).
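the narrowing effect can be shown with a toy model; the library, topics and refocusing rule below are made up, just to illustrate how history-based filtering shrinks the pool:

```python
# toy model of history-driven "refocusing": each search restricts the pool
# of sources to those overlapping previously-searched topics
library = {
    "book1": {"politics", "history"},
    "book2": {"cooking"},
    "book3": {"history", "economics"},
    "book4": {"travel"},
}

def refocused_search(library, history):
    """Return only the items sharing at least one topic with past searches."""
    if not history:
        return set(library)
    return {item for item, topics in library.items() if topics & history}

first_pool = refocused_search(library, set())         # first search sees everything
later_pool = refocused_search(library, {"history"})   # later searches see less
```

every iteration feeds its results back into the history, so the visible library can only shrink.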

or your newspapers: if you follow a newsworthy “thread” for a few days, you will get more of the same in the future, whenever you visit that website.

certainly a more intuitive way to search would be useful - albeit probably in the future there will be “individual” search approaches.

when I visit a library, there is often an online catalogue; but the way I search, or use the search results to feed further searches, is still primitive: you cannot “dig into results” to seed further searches and redirect your search, as that has to be done manually.

even after standing in line to pick up books, you may discover that, if you had had access to a mere “tag cloud” about the book, or to its table of contents, you would not have waited at all.
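such a “tag cloud” is cheap to compute; a rough sketch, with a made-up table of contents and an arbitrary stopword list:

```python
import re
from collections import Counter

STOPWORDS = {"the", "of", "a", "an", "and", "to", "in"}

def tag_cloud(text, top=5):
    """Rough tag cloud: the most frequent non-stopword terms with their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS).most_common(top)

toc = """Introduction to search engines. Indexing the web.
Ranking search results. Search quality and relevance."""
cloud = tag_cloud(toc)
```

the counts would then drive the font size of each tag - enough to decide, before queuing, whether the book is worth the wait.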

instead, this is an area where keeping track not only of your library history (as I discovered a couple of weeks ago happens in Italy), but also having a “smarter” catalogue that asks you what the aim of your current search is, could help.

disclosure: I applied some time ago for positions at Google, but I never worked with or for them, even if I volunteered some ideas (through their funny feedback system: tailored to avoid any risk of IPR ownership claims, as it is the only one I know that does not forward you a copy of what you sent, and that sends a “thank you” note with no reference whatsoever to it) - but in this article I write as an analyst, not as a candidate (former or future).

a few years ago, in 2009, a friend suggested Wolfram Alpha, a different kind of search engine: a cross between the old Yahoo (the one that manually classified links) and an expert system (understanding what you are asking, and looking for answers, not for links).

at the time, people asked: is this complementary to Google?

well, recently Google started adding something that could eventually converge with Alpha, making the question almost irrelevant.

almost: because, in the end, Alpha for the time being returns processed information, i.e. knowledge (e.g. closer to Wikipedia articles with some additional information, based on the meaning of your request), while Google returns processed data, i.e. information (e.g. ask “exchange rate 80000 EUR in yen” in both, and see the difference) - but then it is up to you to add context.

Alpha sometimes is like those people who, when asked how to get from A to B, will tell you the answer along with the history of the itinerary, the best restaurants, and a list of the hotels near B - a bit of an overkill.

but, if you go for more complex queries, or queries where the context makes the difference (e.g. try just “irrelevance” in both), Alpha is still more “humane”, while Google is closer to a librarian that, when you ask about something, sends you to the section where you can find books about that subject, but provides no guiding advice.

the (current) difference between the two systems? According to Wolfram:

WOLFRAM|ALPHA uses built-in knowledge curated by human experts to compute on the fly a specific answer to every query. SEARCH ENGINES index web pages, then look for textual matches, then give you lists of links.

or: “Wolfram|Alpha is a computational knowledge engine, not a search engine.”, i.e. Alpha is a framework to associate “knowledge frames” prepared by experts with current, relevant data, while a search engine is a giant sieve.

maybe yesterday, but tomorrow?

Stifling innovation

do you remember the “Don’t be evil” mantra from Google, broadcast while Microsoft was considered to be the evil empire?

long ago, once in a while I met or heard of startups focused on creating something complementary to Microsoft Office - something that made so much sense… that eventually Microsoft added the feature, a few years down the road: e.g. compare the features of Excel in the late 1990s, and the complementary software products available on the market then, with Excel 2010.

few people really use Pivot Tables - mainly because the first versions were clumsy; but both Office 2007 and Office 2010 deliver Pivot Tables that, frankly, cover 80% of the uses I have seen since the mid-1990s for business intelligence tools.
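at its core, a Pivot Table is a group-and-aggregate over rows; a minimal pure-Python sketch, with made-up sales figures, of what the feature computes:

```python
from collections import defaultdict

# made-up rows: (region, product, amount)
rows = [
    ("north", "widgets", 100),
    ("north", "gadgets", 50),
    ("south", "widgets", 70),
    ("north", "widgets", 30),
]

def pivot(rows):
    """Sum amounts by (region, product) - the core of a pivot table."""
    table = defaultdict(int)
    for region, product, amount in rows:
        table[(region, product)] += amount
    return dict(table)

summary = pivot(rows)
```

the spreadsheet version adds the cross-tab layout, subtotals and interactive regrouping on top of this aggregation.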

I am referring to the real uses by real users - not the theoretical features that nobody uses, or that require a technical expert.

it was common knowledge: the “evil” part was that Microsoft seemed focused on absorbing anything in sight.

but it wasn’t alone: I remember a time when it seemed as if Oracle released a product after each innovative use of Oracle on a project (before it started buying companies).

it is part of the software industry’s DNA: in my talks with colleagues around Europe, I was told more than once stories of developing a solution complementary to a software product that they distributed, only to find that the (American or European - no difference) publisher of the original software, despite all the “intellectual property protection” brouhaha against emerging markets, had no qualms about trying to get the idea for free.

usually saying “we will all benefit from this” - repeating the old “What’s good for General Motors is good for the country”, a misquotation of C. E. Wilson (who actually said “I thought what was good for the country was good for General Motors and vice versa.”).

nowadays, “Don’t be evil” is on the path to becoming a marketing or legalistic ploy - look at the number of new products and features that are derived from the market, rather than adding to it.

the latest innovation is the “unified privacy” - it makes sense from a technical perspective (for Google), but why should my access, registration, usage data on YouTube be cross-linked with my usage of Google, Gmail, or the contents of GoogleDocs and Gmail messages?

it is actually a way to stifle innovation: no new service will be able to compete with the marketing (meaning: competition, growing trends, and even what others are doing and reporting on competitors) knowledge that cross-product users (and even the content of what they write or receive from others…) will provide to Google.

the difference with Microsoft is that the platform (Google+Android) is pervasive, and becoming more a business layer on top of the Internet than just yet another service provider.

therefore, it controls not only the service platform (as it was for Microsoft Office or Windows), but, in the end, also the distribution channels (think: if you want to go online, more often than not you end up there).

even newspapers, book publishers, public libraries eventually made agreements with Google: it was easier and cheaper than any of the alternatives.

it recalls Microsoft’s approach - each new product entered the market using the funding provided by the results of previous products, competing in a new segment where the resources available to competitors were significantly lower: but that was the MS-DOS/Windows marketing DNA, wasn’t it?

corporations too: I have read and heard of quite a few considering dropping personal computers and shifting to tablets.

of course - in a market, any company tries to expand: but expanding to the point where no competition is possible?

private monopolists do not innovate - in our growth-obsessed culture, they look for new ways to increase the turnover and profitability.

consider the integration of your online access and search history as the perfect CRM database, providing visibility on anything you do, use, read, write, say online.

would you develop an application or service, if you knew that, even before you can complete the kick-starting phase, the incumbent competitor knows more about your market’s real feedback and your own weaknesses and strengths than you do?

maybe you would, but who would finance you?

but there are more far-fetched potential side-effects linked to the ownership of the perfect CRM database that is used by over 90% of the potential global audience to “leave behind” traces of their own private and business life.

Who owns knowledge, and under which jurisdiction

I like reminding of a non-business case: as an experiment, Google monitored trends in reports about the flu - and the company was able to get the CDC on board, as Scientific American reported (but Popular Mechanics partially disagreed with the value of the trend-spotting, with reasonable objections).

but I am equally sceptical about the recent privacy innovation in Europe, i.e. the eventual right to obtain that any information you posted or “left behind” online is removed, whenever you ask for it (see my older take on privacy, nationality, and Internet neutrality, posted in 2010, after the CEO of Google stated that privacy was going to die).

on Google, I half-jokingly wrote long ago that it could eventually use its own patents to build a self-sustaining community, maybe on an artificial island (like Sealand).

or even end up as the first global infrastructure that is directly managed by the UN.

as for the concept of “nationality”: even more than in 2009, when I posted an article on a virtual country, a virtual currency, and taxing the digital economy, we are getting closer and closer to a convergence of issues - our laws and regulations, including WTO-related regulations and privacy, assume that we are dealing with a hierarchy linking companies to countries.

but data ownership, too, is becoming a fluid concept, one that is often not easy to enforce.

nothing really new - as it is closer to the issues raised by proliferation within a knowledge economy.

I think, and I wrote long ago something more structured about the subject, that we are heading toward a borderless world, but that does not necessarily imply that it needs to become a no man’s land.

what if the “global search”, “global profile”, “virtual currency”, and “net neutrality” were to eventually be acknowledged as the first practical steps toward creating the legal framework that will lead to the transformation of multinational corporations into yet another form of state? what would, then, be the approach to be used to move from our current concept of jurisdiction?

