What's in your download directory

Directory structure

In your download bucket you will find the following structure:

graph LR
    root[GCP Bucket] --> 1[1292766]
    1 --> 11[1292768.aif]
    1 --> 12[1292769.mp4]
    1 --> 111[1292823.webm]
    root --> 2[9876]
    2 --> 21[1235.mp3]
    2 --> 22[12315.mp4]
    root --> 3[inst-metadata.csv]
    root --> 4[inst-stats.csv]
    subgraph media[MediaID]
      1
      2
    end
    subgraph clipid[ClipID]
      1
      11
      12
      111
      2
      21
      22
    end
    subgraph meta[Media Metadata]
      3
    end
    subgraph stats[Statistics]
      4
    end
    click 1 "#media"
    click 2 "#media"
    click 3 "#metadata"
    click 4 "#stats"


linkStyle 0,1,2,3,4,5,6 stroke-width:1px;

style clipid fill:transparent,stroke:#323232,stroke-width:1px,stroke-dasharray:5;

Media

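Each media item has an ID (the MediaID in the diagram above), which is also the name of its directory and matches the id field in the metadata file. Within this directory, you will find the versions of that media item in different encodings (such as H.264 or MP3). Each file is named after its ClipID.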
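As an illustration of the layout described above, here is a minimal sketch that lists the clip files found under each MediaID directory. It assumes the bucket contents have already been copied to a local directory; the function name `list_clips` and the `download_root` parameter are hypothetical and not part of the download itself.

```python
from pathlib import Path


def list_clips(download_root):
    """Map each MediaID directory name to the clip file names inside it.

    Assumption: download_root is a local copy of the download bucket.
    Top-level files (the metadata and stats CSVs) are skipped.
    """
    clips = {}
    for entry in sorted(Path(download_root).iterdir()):
        if entry.is_dir():
            # Each file in a MediaID directory is one encoding (clip) of that item.
            clips[entry.name] = sorted(
                p.name for p in entry.iterdir() if p.is_file()
            )
    return clips
```

Calling `list_clips("path/to/local/copy")` returns a dictionary keyed by MediaID, which can then be matched against the id field in the metadata file.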

Metadata

This file (shown as inst-metadata.csv in the diagram above) is in CSV format with a header row. It maps the ID of each media item to its metadata and contains the following fields:

| Field | Description |
| --- | --- |
| filename | Internal file name |
| id | ID of the media item |
| collection_id | ID of the original collection |
| title | Title of the media |
| creator | CRS ID of the uploader |
| publisher | Publisher (as entered) |
| copyright | Copyright owner (as entered) |
| language | Language (as entered) |
| description | Description |
| abstract | Abstract |
| transcript | Transcript |
| keywords | Keywords |
| visibility | Permission (world, cam-only, acl) |
| acl | ACLs that had access to this media |
| aspect_ratio | Calculated aspect ratio |
| screencast | Whether this was a screencast (true/false) |
| image_id | ID of the thumbnail (not transferred) |
| type | Video or Audio |
| archive_path | Not used |
| archive_file | Not used |
| dspace | Not used |
| dspace_path | Not used |
| src_filename | Original file path |
| have_data | Always true |
| dest_filename | Original encoding path |
| status | Always Uploaded |
| priority | Encoding priority |
| withdrawn | Not used |
| featured | Shown on the home page |
| downloadable | Downloadable from sms.cam.ac.uk |
| branding | True/False |
| created | Date created, in ISO format |
| last_updated | Date last updated, in ISO format |
| updated_by | CRS ID of the last person who updated the item |
| explicit | Not used |
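The fields above can be read with any CSV library. The following is a minimal sketch using Python's standard `csv` module to index the metadata rows by id. The function name `load_metadata` is hypothetical, and UTF-8 encoding is an assumption about the file, not something stated here.

```python
import csv


def load_metadata(csv_path):
    """Return a dict mapping each media item's id to its metadata row.

    Assumptions: the file is UTF-8 encoded and its header row uses the
    field names listed in the table above.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # DictReader uses the header row as the keys of every row.
        return {row["id"]: row for row in csv.DictReader(handle)}
```

With the result stored in `metadata`, an expression such as `metadata["1292766"]["title"]` looks up the title of the media item whose directory is named 1292766.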

Stats

Also included is a CSV file of download statistics (shown as inst-stats.csv in the diagram above), covering downloads up to 10 Jan 2025. Statistics are recorded per day, per clip. The fields are shown below:

| Field | Description |
| --- | --- |
| day | Date (DD/MM/YYYY) |
| clip_id | Matches the file name above |
| is_rtsp | Served over Real-Time Streaming Protocol (RTSP) |
| is_itunes | Downloaded by iTunes |
| media_id | Matches the directory above |
| collection_id | Collection ID |
| instid | Institution ID |
| format | Type of file (e.g. webm, mp3, mpeg4) |
| quality | Quality of the media |
| fetch_type | Download (dl) or stream |
| is_cam | Whether the downloader was within the University of Cambridge (true/false) |
| country | Country of origin |
| num_hits | Number of hits |
| num_bytes | Number of bytes downloaded |
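Because the statistics are recorded per day and per clip, a total for a whole media item has to be summed across rows. The sketch below shows one way to do that and to parse the day field. It assumes the stats CSV is UTF-8 encoded and has a header row using the field names above; the function names `hits_per_media` and `parse_day` are hypothetical.

```python
import csv
from collections import Counter
from datetime import datetime


def hits_per_media(csv_path):
    """Sum num_hits for each media_id across all days and clips.

    Assumption: the file has a header row with the field names listed above.
    """
    totals = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            totals[row["media_id"]] += int(row["num_hits"])
    return totals


def parse_day(value):
    """Parse the day field, documented above as DD/MM/YYYY, into a date."""
    return datetime.strptime(value, "%d/%m/%Y").date()
```

The same pattern works for num_bytes, or for grouping by clip_id instead of media_id when per-encoding totals are needed.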