Wikimedia Commons: a discoverability opportunity for the performing arts

Why is Wikimedia Commons a discoverability opportunity for the performing arts?

In a previous post outlining the essential steps to a productive digital presence for the performing arts, we highlighted one best practice in particular: sharing images in the Wikimedia Commons media library, a sister project of Wikipedia and Wikidata, under a free-to-use Creative Commons licence.

This practice deserves attention because, even if it may seem complex or disorienting at first, it is above all an extraordinary way to benefit from the favourable treatment search engines give to Wikimedia content.

We’ll be looking at Wikimedia Commons from three different angles.

PART 1: Why use Wikimedia Commons
PART 2: How to publish and describe content in Wikimedia Commons
PART 3: How to set up a customized rights clearance strategy for existing and future audiovisual content (future post)

PART 1: Why use Wikimedia Commons

In search results, an image draws us in more than 1000 words would.

A growing share of the results that search engines generate is enriched with audiovisual content. In fact, 22.6% of internet searches are done in Google Images (2019). More than ever, an image is a way to be discovered. The impact of images on the discoverability of books has been demonstrated, and there is no denying their importance for live performances: images are at the heart of live performance.

The performing arts community as a whole has every reason to adopt a more systematic and strategic approach to publishing good quality, reusable images that can be freely shared.

Why should I use Wikimedia Commons?

Because Wikimedia Commons is a privileged reference space for search engines looking for meaningful images. By making millions of royalty-free images available with descriptive data, Wikimedia Commons meets a very specific need of search engines.

In fact, the appetite of Google and other search engines for images and videos looks like an enormous paradox: search engines cannot interpret the content of an image or a video on their own, yet the results they generate include more and more visual content. It is very difficult for a machine to “understand” the content of an image or a video, whereas text data can be “understood” and used far more easily thanks to natural language processing.

An image or a video is what is called “unstructured” data. Even if an image contains very rich data for human eyes, it cannot by itself become an answer to a search query because it has no descriptive metadata, apart from its file name and its (often missing) descriptive title, which would allow an algorithm to understand the image’s content. The image itself is not structured enough to be interpreted by a machine.

The machine does not “see” what we see. To overcome this blindness, algorithms need access to so-called “structured” information, which provides an exact description of the content in a language that can be understood by both humans and machines. This is precisely where Wikimedia Commons comes in.
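To make the contrast concrete, here is a minimal sketch in Python. The file name, field names and values are all invented for illustration; they do not reflect the actual Wikimedia Commons data model. The sketch only shows why a bare file gives an algorithm nothing to match against, while a described file does.

```python
# Illustrative only: the file name and every field below are made up.

# Unstructured: all a machine has is an opaque file name.
unstructured_image = {"file_name": "IMG_4032.jpg"}

# Structured: the same image with an explicit, machine-readable description.
structured_image = {
    "file_name": "IMG_4032.jpg",
    "caption": "Dancers performing on stage",
    "depicts": ["dance", "stage", "performance"],
    "licence": "CC BY-SA 4.0",
}


def matches_query(image: dict, term: str) -> bool:
    """Return True if the search term appears in the image's descriptive data."""
    term = term.lower()
    caption = image.get("caption", "").lower()
    subjects = [s.lower() for s in image.get("depicts", [])]
    return term in caption or term in subjects


print(matches_query(unstructured_image, "dance"))  # False
print(matches_query(structured_image, "dance"))    # True
```

The image is identical in both cases; only the second record can answer a query.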

Three overriding principles have made Wikimedia Commons a first-rate reference site for search engines:

Audiovisual documents that are freely usable, and therefore shareable
Identified audiovisual documents, with a set of descriptions that can be interpreted by both humans and machines
An environment that allows for multilingual indexing, especially for image captions, which adds to the discoverability of documents imported into Wikimedia Commons
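The third principle, multilingual captions, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the captions, the language codes chosen and the helper functions `caption_for` and `languages_matching` are hypothetical sample material, not an actual Wikimedia Commons API.

```python
from typing import Optional

# Illustrative only: sample captions for one hypothetical image.
captions = {
    "en": "Dancers performing on stage",
    "fr": "Danseurs en représentation sur scène",
    "sv": "Dansare som uppträder på scen",
}


def caption_for(captions: dict, language: str, fallback: str = "en") -> Optional[str]:
    """Return the caption in the requested language, or the fallback language."""
    if language in captions:
        return captions[language]
    return captions.get(fallback)


def languages_matching(captions: dict, term: str) -> list:
    """List the languages whose caption contains the search term."""
    term = term.lower()
    return sorted(lang for lang, text in captions.items() if term in text.lower())


print(caption_for(captions, "fr"))             # the French caption
print(caption_for(captions, "de"))             # no German caption, falls back to English
print(languages_matching(captions, "scène"))   # ['fr']
```

Each added language gives the same image one more chance of matching a query, which is how multilingual captions add to discoverability.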

Wikimedia Commons and the performing arts: the time has come

Wikimedia Commons currently contains more than 80 million media files (images, videos, sounds, 3D animations, etc.) that are freely usable and that anyone can enrich. In only four years, 60 million of these files have been provided with structured data that makes them easy to find, display, and reuse.

This progress was made possible by a project dedicated to this objective. The ISA tool and many other inspiring projects have been created as a result, and today they help develop a very valuable “data culture” around images. The Swedish Performing Arts Agency, for example, is in the process of publishing more than 5,000 audiovisual files on Wikimedia Commons.

Wikimedia Commons is a preferred reference source for search engines

The reliability of results is a big issue in the field of search engines. Algorithms have very few ways of evaluating the quality of their results. One of those ways is the reputation of the reference site, and this is where Wikimedia Commons really stands out.

Wikimedia Commons is the only ecosystem without any personalization. It is collaborative, multilingual and open source: an encyclopedic universe that is a de facto guarantor of the authenticity of information that can be distributed indefinitely. The performing arts sector could make it the focus of its discoverability efforts while benefitting from increased visibility. With the help of such a major player, the performing arts community would be far better integrated into the grand design of the semantic web.

For 10 years now, Wikipedia and its offshoots have been preferred by search engines 99% of the time because their data is well structured and reliable. The Web is currently undergoing a process of centralization, and going forward it is of the utmost importance that we rely on efficient and stable websites.

Wikimedia and its constellation of resources lead the field. Alongside Wikipedia and Wikidata, Wikimedia Commons is the third pillar of a lasting presence and increased discoverability for the performing arts, which, as you may recall, are increasingly active and present in Wikidata and Wikipedia.

Next steps

Contributing to Wikimedia Commons will provide better discoverability for all stakeholders and contribute to amassing a visual heritage for the performing arts.

Publishing tools are available to anyone who wants to publish one or more images in the media library. The process is simple, and we invite you to discover it in our second part, “How to publish and describe an image in Wikimedia Commons” (forthcoming).

Finally, you can learn more about the possibilities offered by Creative Commons licences in the third part of this exploration of Wikimedia Commons, “How to set up a customized rights clearance strategy for existing and future audiovisual content” (forthcoming).

Author: Véronique Marino with the contribution of Frederic Julien.