Wikipedia Articles and Resources

NYT, 1/7/08: Wiki search engine

Wikia, Jimmy Wales's company, introduces Wikia Search (wikia.com), a new search engine...

Mr. Wales expects his new Internet search engine, Wikia Search, an early version of which is being made available to the public Monday at www.wikia.com, to follow a similar trajectory.
“We want to make it really clear that when people arrive and do searches, they should not expect to find a Google killer,” Mr. Wales said. Instead, people who use the Wikia search engine should understand that they are part of the early stages of a project to build a “Google-quality search engine,” Mr. Wales said.
Like Wikipedia, Mr. Wales plans to rely on a “wiki” model, a voluntary collaboration of people, to fine-tune the Wikia search engine. When it starts up Monday, the service will rank pages based on a relatively simple algorithm. Users will be allowed and encouraged to rate search results for quality and relevance. Wikia will gradually incorporate that feedback in its rankings of Web pages to deliver increasingly useful answers to people’s questions.

NYT, 1/6/08: Borges and the Foreseeable Future

Article [1]

Yet a growing number of contemporary commentators — whether literature professors or cultural critics like Umberto Eco — have concluded that Borges uniquely, bizarrely, prefigured the World Wide Web. One recent book, “Borges 2.0: From Text to Virtual Worlds” by Perla Sassón-Henry, explores the connections between the decentralized Internet of YouTube, blogs and Wikipedia — the so-called Internet 2.0 — and Borges’s stories, which “make the reader an active participant.”

NYT, 12/15/07: Google's response to Wikipedia: KNOL

Google's competition: KNOL [2]

The service, called Knol, which is short for knowledge, would allow people to create Web pages on any topic. It is designed to include features that permit readers to submit comments, rate pages and suggest changes. However, unlike Wikipedia, which allows anyone to edit an entry, only the author of a “knol,” as the pages in the service would be called, would be allowed to edit. Different authors could have competing pages on the same topic.

NYT 11/19/07: Wiley book uses 5 paragraphs from Wikipedia

No big deal! :-) [3]

The publisher John Wiley & Sons confirmed last week that its book “Black Gold: The New Frontier in Oil for Investors” by George Orwel had lifted almost word for word about five paragraphs from a Wikipedia article on the Khobar Towers bombing in Saudi Arabia.

NYT 3/16/08: On-line vs printed encyclopedias

  • Start Writing the Eulogies for Print Encyclopedias, NYT article, March 16, 2008, arguing that printed encyclopedias are on their way out as publishers and readers move on-line. [4]

Books on Wikipedia

  • Help write the last chapter of a book on Wikipedia. Chapter [5] deals with the future of Wikipedia [6]. The author is Andrew Lih, an early Wikipedia author/user.
Interesting way of writing. It reminds me of the senior editor of Wired who is writing (as of 06/08) a book on the pages of his blog, allowing everybody interested to provide insights/suggestions.

Wikipedia as Newsmedia

  • Wikipedia announces death of Tim Russert 30 minutes before NPR [7]
  • Russert's death updated on Wikipedia before NBC has time to talk to the family [8], NYT article, 6/23/08
  • Get the NYT articles on Wikipedia here [9]

MediaWiki Formats

  • Page table - MediaWiki [10]: MySQL organization of the page table
  • Text table - MediaWiki [11]: MySQL format of the text table
  • API - MediaWiki [12]
  • Revision table - MediaWiki [13]
  • Quick instructions for running your own MySQL queries on Wikipedia [14]
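As a rough illustration of how the page and revision tables relate, here is a minimal sketch that counts revisions per article. It runs against an in-memory SQLite database rather than a real MySQL dump, and the schema is a heavily trimmed stand-in: only the columns `page_id`, `page_namespace`, `page_title`, `page_latest`, `rev_id`, `rev_page` and `rev_timestamp` are kept, and the sample rows are made up for the demonstration. The helper names `build_demo_db` and `revision_counts` are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for the MediaWiki page/revision tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id        INTEGER PRIMARY KEY,
            page_namespace INTEGER NOT NULL,   -- 0 is the main article namespace
            page_title     TEXT    NOT NULL,
            page_latest    INTEGER NOT NULL    -- rev_id of the current revision
        );
        CREATE TABLE revision (
            rev_id        INTEGER PRIMARY KEY,
            rev_page      INTEGER NOT NULL,    -- points at page.page_id
            rev_timestamp TEXT    NOT NULL     -- YYYYMMDDHHMMSS
        );
        """
    )
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?, ?)",
        [(1, 0, "Jorge_Luis_Borges", 12), (2, 0, "Knol", 20)],
    )
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?, ?)",
        [
            (10, 1, "20080106120000"),
            (11, 1, "20080107093000"),
            (12, 1, "20080110181500"),
            (20, 2, "20071215080000"),
        ],
    )
    return conn


def revision_counts(conn):
    """Return (title, number of revisions, latest timestamp) per article."""
    return conn.execute(
        """
        SELECT p.page_title, COUNT(r.rev_id) AS n_revs, MAX(r.rev_timestamp)
        FROM page AS p
        JOIN revision AS r ON r.rev_page = p.page_id
        WHERE p.page_namespace = 0
        GROUP BY p.page_id, p.page_title
        ORDER BY n_revs DESC
        """
    ).fetchall()


conn = build_demo_db()
for title, n_revs, latest in revision_counts(conn):
    print(title, n_revs, latest)
```

The same join should carry over to a real MediaWiki database with little change, but the full tables have many more columns and the schema differs between MediaWiki versions, so check the table documentation linked above first.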

Wikipedia and trustworthiness

  • Papers on Web content trustworthiness by Pierpaolo Dondio et al. [15], Trinity College Dublin.
  • Index of /~seigneur/publications [16] at U. Geneva. Writes on trust issues in networks and Wikipedia.
  • Software Weighs Wikipedians' Trustworthiness [17] in the Chronicle of Higher Education, 8/3/07
    • Extracting Trust from Domain Analysis: a Case Study on Wikipedia Project [19]

  • Wikipedia trustworthiness, on digitaleccentric.blogspot.com. [20]
"There was a brief article in the Chronicle of Higher Ed last week that
I didn't spot until yesterday -- UC Santa Cruz researchers have developed a
simple yet clever test of the trustworthiness of wikipedia article authors..."
  • A Content-Driven Reputation System for the Wikipedia [22], article by Luca de Alfaro.
Abstract: We present a content-driven reputation system for Wikipedia authors.
In our system, authors gain reputation when the edits they perform to Wikipedia
articles are preserved by subsequent authors, and they lose reputation when their
edits are rolled back or undone in short order. Thus, author reputation is computed
solely on the basis of content evolution; user-to-user comments or ratings are not used.
The author reputation we compute could be used to flag new contributions from
low-reputation authors, or it could be used to allow only authors with high reputation
to contribute to controversial or critical pages. A reputation system for the Wikipedia
could also provide an incentive for high-quality contributions.
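The idea in the abstract above can be sketched as a toy model: an author gains reputation when the words they added survive the next author's revision, and loses reputation when those words are removed. This is only an illustration under strong assumptions, not the algorithm from the paper. The word-set "diff", the one-step look-ahead, the `reward` and `penalty` weights, and the function names are all invented here.

```python
from collections import defaultdict


def added_words(before, after):
    """Words present in `after` but not in `before` (a crude stand-in for a text diff)."""
    return set(after.split()) - set(before.split())


def content_driven_reputation(revisions, reward=1.0, penalty=1.0):
    """Score authors by whether the next author keeps or removes their additions.

    `revisions` is a chronological list of (author, text); the first entry is
    treated as the starting text and the last edit has not been judged yet.
    """
    reputation = defaultdict(float)
    for i in range(1, len(revisions) - 1):
        author, text = revisions[i]
        judge, next_text = revisions[i + 1]
        if judge == author:
            continue  # an author's own follow-up edit does not count as a judgment
        contributed = added_words(revisions[i - 1][1], text)
        if not contributed:
            continue
        kept = contributed & set(next_text.split())
        removed = contributed - kept
        reputation[author] += reward * len(kept) - penalty * len(removed)
    return dict(reputation)


# Made-up edit history: bob's addition is preserved, the vandal's is rolled back.
history = [
    ("alice", "Borges was a writer"),
    ("bob", "Borges was an Argentine writer"),
    ("vandal", "Borges was an Argentine writer SPAM LINK"),
    ("carol", "Borges was an Argentine writer"),
]
print(content_driven_reputation(history))  # {'bob': 2.0, 'vandal': -2.0}
```

As in the abstract, no user-to-user ratings are involved: the score depends only on how the content evolves.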

Wikipedia and ontologies

  • An Ontology automatic construction approach with multi-dimension data mining oriented column, ChengBoyuan et al. (no electronic version available)
  • Decoding Wikipedia categories for knowledge acquisition, by Vivi Nastase and Michael Strube. [23]
  • Harvesting Wiki Consensus - Using Wikipedia Entries as Ontology Elements, by Martin Hepp, Daniel Bachlechner, Katharina Siorpaes [24]
Abstract. One major obstacle towards adding machine-readable annotation to
existing Web content is the lack of domain ontologies. While FOAF and Dublin
Core are popular means for expressing relationships between Web resources
and between Web resources and literal values, we widely lack unique identifiers
for common concepts and instances. Also, most available ontologies have a
very weak community grounding in the sense that they are designed by single
individuals or small groups of individuals, while the majority of potential users
is not involved in the process of proposing new ontology elements or achieving
consensus. This is in sharp contrast to natural language where the evolution of
the vocabulary is under the control of the user community. At the same time,
we can observe that, within Wiki communities, especially Wikipedia, a large
number of users is able to create comprehensive domain representations in the
sense of unique, machine-feasible, identifiers and concept definitions which are
sufficient for humans to grasp the intension of the concepts. The English
version of Wikipedia contains now more than 850,000 entries and thus the same
amount of URIs plus a human-readable description. While this collection is on
the lower end of ontology expressiveness, it is likely the largest living ontology
that is available today. In this paper, we (1) show that standard Wiki technology
can be easily used as an ontology development environment for named classes,
reducing entry barriers for the participation of users in the creation and
maintenance of lightweight ontologies, (2) prove that the URIs of Wikipedia
entries are surprisingly reliable identifiers for ontology concepts, and (3)
demonstrate the applicability of our approach in a use case.

Local belief

  • A new framework for local belief revision, Doukari, Jeansoulin, Wurbel. (no electronic versions of paper available)

Wikipedia and its uses in academia

  • Making wikis work for scholars, Inside Higher Ed. [25] Introduces Citizendium, an attempt to link scholars and Wikipedia to enforce validation of contents. Also discusses Scholarpedia as an attempt to have a peer-reviewed process and authors having their "own" pages, i.e. owning a page. (I don't think this could/will fly).

  • Wikipedia in academic studies [26]
  • Wikipedia-lab.org [27], a project of Kotaro Nakayama, whose research includes Wikipedia mining [28]

New companies using Wikipedia

  • Powerset Debuts With Search of Wikipedia [29], article in NYT 6/18/08 on Powerset, a company offering semantic search of Wikipedia. Powerset URL: [30]

Wikipedia Scanner

  • EA Staffer Attempts to Alter Wiki History [31]. Article in Shacknews.com using the Wikipedia scanner to show that somebody had "cleaned up" evidence of somebody's involvement in a company.
  • See Who's Editing Wikipedia - Diebold, the CIA, a Campaign, Wired Magazine [32]
  • Seeing Corporate Fingerprints in Wikipedia Edits [33] in NYT 8/19/07

  • Wikipedia Scanner outs Vatican, CIA [34] on News.com.au
  • Microsoft riles Wikipedia [35], in seattlepi.com, January 24, 2007. Microsoft pays a consultant to edit Wikipedia entries about itself.

Wikipedia History usage

  • On Morfologik: how to use Wikipedia's edit history to find typos and see how they were corrected [36]
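One way to picture mining edit history for typo corrections is to compare two consecutive revisions word by word and keep replacements where the old and new words are nearly identical. The sketch below is an assumption about how such a filter could work, not the method described in the linked post. It uses only Python's standard `difflib`, and the function names and the `min_similarity` threshold of 0.75 are invented for the example.

```python
import difflib


def word_replacements(old_text, new_text):
    """Pair up words that were replaced one-for-one between two revisions."""
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep only same-length replacements so words can be aligned position by position.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            pairs.extend(zip(old_words[i1:i2], new_words[j1:j2]))
    return pairs


def likely_typo_fixes(old_text, new_text, min_similarity=0.75):
    """Keep replacements where the two spellings are close, which suggests a typo fix."""
    fixes = []
    for old, new in word_replacements(old_text, new_text):
        similarity = difflib.SequenceMatcher(a=old, b=new).ratio()
        if old != new and similarity >= min_similarity:
            fixes.append((old, new))
    return fixes


old_rev = "Wikipedia is a free encylopedia that anyone can edit"
new_rev = "Wikipedia is a free encyclopedia that everyone can edit"
print(likely_typo_fixes(old_rev, new_rev))  # [('encylopedia', 'encyclopedia')]
```

The change from "anyone" to "everyone" is dropped because the two words are too different to look like a spelling correction. A real pipeline would need a better threshold and protection against vandalism reverts.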

General use of Wikipedia contents on other sites

  • Somebody asking questions in a blog about using Wikipedia contents on his Web site... [37]. Appeared on Daniweb Apr 27th, 2006.
  • Seen in HackDiary [38]

Using Wikipedia and the Yahoo API to give structure to flat lists.

Some of my recent (and final) work at the BBC has involved breathing life into old
rolodex-style flat databases of content. With my colleague Tom Coates, I've been
puzzling over how to take a list of text strings like this:
"AGNEW, Spiro", "ATTLEE, Clement", "BARBER, Anthony", "BEVAN, Aneurin", "BLAIR, Tony",
"CALLAGHAN, James", "CHAMBERLAIN, Neville", "CHURCHILL, Winston", "COULTHARD, David",
"DYALL, Valentine", "EDEN, Anthony", "FOOT, Michael", "GAITSKELL, Hugh",
"HAGUE, William", "HEATH, Edward", "HESELTINE, Michael", "JENKINS, Roy", "KINNOCK, Neil",
"MACLEOD, Iain", "MACMILLAN, Harold", "MARSHALL, David", "MILLIGAN, Spike",
"NIXON, Richard", "REDWOOD, John", "THATCHER, Margaret", "WILSON, Harold"
and turn it into a network of directed links like this [39].
Hopefully anyone who has a passing knowledge of the history of the British government
will agree that it's a convincing little map, easily usable as a basis for navigation
around the concepts attached to the text strings
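The step from flat strings to a network of directed links can be sketched roughly as follows. This is not the HackDiary author's code. It assumes the hard part, finding each person's Wikipedia article and the articles it links to, has already been done and is available as a plain dictionary (`outgoing_links`, a hypothetical input). The sample links are made up for illustration, and both function names are invented.

```python
def display_name(entry):
    """Turn 'SURNAME, Forename' into 'Forename Surname'."""
    surname, _, forename = entry.partition(",")
    return f"{forename.strip()} {surname.strip().title()}".strip()


def link_graph(entries, outgoing_links):
    """Build a directed graph restricted to the names in the flat list.

    `outgoing_links` maps an article title to the titles it links to; only
    links whose target is also in the list are kept.
    """
    names = [display_name(e) for e in entries]
    known = set(names)
    graph = {}
    for name in names:
        targets = outgoing_links.get(name, ())
        graph[name] = sorted(t for t in set(targets) if t in known and t != name)
    return graph


entries = ["ATTLEE, Clement", "BEVAN, Aneurin", "CHURCHILL, Winston"]

# Illustrative, made-up link data; real data would come from the articles themselves.
sample_links = {
    "Clement Attlee": ["Winston Churchill", "Aneurin Bevan", "Labour Party"],
    "Aneurin Bevan": ["Clement Attlee", "National Health Service"],
    "Winston Churchill": ["Clement Attlee"],
}

for source, targets in link_graph(entries, sample_links).items():
    print(source, "->", targets)
```

Links to articles outside the list (here "Labour Party" and "National Health Service") are discarded, so the graph only connects the original text strings to each other.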

Wikipedia Installation/Backup

  • Somebody's blog about installing and mirroring Wikipedia on a Linux system [40], in metachronistic linux

Statistics of Wikipedia edits

  • Who Writes Wikipedia? (Swartz 2006) [41]. A simple and crude attempt at figuring out some statistics on who edits what on Wikipedia.

Competition for Wikipedia

  • Knol - Wikipedia, the free encyclopedia [42]: Wikipedia article on Knol, a project of Google

Misc. articles on Wikipedia

  • An Oracle Part Man, Part Machine, article in the NYT, Sept 23, 2007 [43]