datapm(1) data packaging system and utilities

SYNOPSIS

datapm COMMAND [OPTIONS]

DESCRIPTION

datapm (data package manager) is a command-line tool and Python library for working with Data Packages and interacting with data hubs such as those powered by CKAN.

COMMANDS

about

About datapm

clone src-spec path [format-pattern] [url-pattern]

Download a package (i.e. metadata and resources) specified by src-spec to path

Resources to retrieve are selected interactively if no format-pattern is given. If provided, the optional glob-style format-pattern and url-pattern arguments are matched against the format and url of the resource to determine whether it should be retrieved.
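
As a rough illustration of the glob-style matching described above, the following sketch filters a list of resources with Python's standard `fnmatch` module. The function name `select_resources`, the dictionary layout of a resource, and the example URLs are assumptions made for this illustration; they are not taken from the datapm code, which may match differently (for example with respect to case).

```python
from fnmatch import fnmatchcase

def select_resources(resources, format_pattern="*", url_pattern="*"):
    """Return the resources whose format AND url match the glob patterns.

    Hypothetical helper: each resource is assumed to be a dict with
    'format' and 'url' keys. Matching is case-sensitive here.
    """
    return [
        r for r in resources
        if fnmatchcase(r["format"], format_pattern)
        and fnmatchcase(r["url"], url_pattern)
    ]

# Made-up resources for the example.
resources = [
    {"format": "csv", "url": "http://example.org/data/spending.csv"},
    {"format": "json", "url": "http://example.org/data/spending.json"},
    {"format": "csv", "url": "http://example.org/archive/old.csv"},
]

# Keep only CSV resources whose URL contains a /data/ path segment.
selected = select_resources(resources, "csv", "*/data/*")
```

Note that in `fnmatch` patterns `*` also matches `/`, so `*/data/*` matches any URL containing `/data/`.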

download src-spec path [format-pattern] [url-pattern]

Download a package (i.e. metadata and resources) specified by src-spec to path

Resources to retrieve are selected interactively if no format-pattern is given. If provided, the optional glob-style format-pattern and url-pattern arguments are matched against the format and url of the resource to determine whether it should be retrieved.

dump pkg-spec path-of-resource-within-pkg

Dump contents of specified resource in specified package to stdout.

help

Show available commands

info package-spec [manifest]

Get information about a package (print package metadata). If manifest is specified, show manifest information rather than package metadata.

WARNING: If you change the metadata for a Python distribution, you may need to rebuild the egg-info for the changes to show up here.

init [path-or-name]

Initialize a data package at path. The package name is taken from the last component of path. If path is simply a name, the package is created in the current directory.
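
The naming rule can be sketched in a few lines of Python. The helper name `package_name_from_path` is hypothetical, and the handling of a trailing slash is an assumption of this sketch rather than documented datapm behavior.

```python
import os

def package_name_from_path(path_or_name):
    """Return the last path component, used here as the package name.

    Hypothetical helper: normpath() strips a trailing separator first,
    so 'data/my-package/' and 'data/my-package' give the same name.
    """
    return os.path.basename(os.path.normpath(path_or_name))

name_from_path = package_name_from_path("/tmp/data/my-package")
name_from_bare = package_name_from_path("my-package")
```

A bare name has no directory part, so it is returned unchanged, which corresponds to creating the package in the current directory.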

license

Show the license

list [index-spec]

List registered packages. If index-spec is not provided, the default index is used.

man

Show the manual

push [source-file] [webstore-url]

Push local package in current directory to remote repository specified in .dpm/config. Alternatively push a single file to the webstore.

register src-spec dest-spec

Register package at src-spec into index at dest-spec.

search index-spec query

Search registered packages in index-spec.

setup action

config [location]: Create a configuration file at location. If no location is specified, use the default (see --config).

index [location]: Set up an index at the location specified in the config.

repo: Set up a repository. The repository is created at the location given by the --repository option or, if that is not given, at the default location specified in the config.

update src-spec dest-spec

As for register.

upload path upload-spec

Upload a file or package at path to upload-spec. An upload-spec has the form:


    upload-dest-id://BUCKET/LABEL

For example:


    ## default ckan upload
    ckan://BUCKET/LABEL


    ## an s3 upload destination
    my-s3://BUCKET/LABEL


    ## local pairtree
    my-pairtree://BUCKET/LABEL


    ## google storage
    my-google-storage://BUCKET/LABEL

Upload destinations are specified in your datapm config file and are of the form:


    [upload:dest-id]
    ofs.backend = s3|google|archive.org|...
    ## see OFS documentation for a given backend
    config-option = config-value
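
To make the relationship between an upload-spec and its config section concrete, here is a minimal parsing sketch. The function names `parse_upload_spec` and `config_section_for` are hypothetical, and the validation rules (all three parts required, label may contain further slashes) are assumptions of this example rather than documented behavior.

```python
def parse_upload_spec(spec):
    """Split 'dest-id://BUCKET/LABEL' into (dest_id, bucket, label).

    Hypothetical helper; raises ValueError if any part is missing.
    """
    dest_id, sep, rest = spec.partition("://")
    bucket, slash, label = rest.partition("/")
    if not (sep and dest_id and bucket and slash and label):
        raise ValueError("expected dest-id://BUCKET/LABEL, got %r" % spec)
    return dest_id, bucket, label

def config_section_for(dest_id):
    """Name of the config section that would describe this destination."""
    return "upload:" + dest_id

dest_id, bucket, label = parse_upload_spec("my-s3://BUCKET/LABEL")
section = config_section_for(dest_id)
```

With the spec `my-s3://BUCKET/LABEL`, the sketch looks for the destination settings under `[upload:my-s3]`.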

OPTIONS

--version

show program's version number and exit

-h, --help

show this help message and exit

-v, --verbose

Give more output

-d, --debug

Print debug output

-q, --quiet

Give less output

--log=FILENAME

Log file where a complete (maximum verbosity) record will be kept

-c CONFIG, --config=CONFIG

Path to config file (if any) - defaults to $HOME/.dpmrc

-r REPOSITORY, --repository=REPOSITORY

Path to repository - overrides value in config

-k API_KEY, --api-key=API_KEY

CKAN API Key (overrides value in config)

CONFIGURATION FILE


    [dpm]
    repo.default_path = $HOME/.dpm/repository
    index.default = file


    [index:ckan]
    ckan.url = http://thedatahub.org/api/
    ckan.api_key =


    [index:db]
    db.dburi = sqlite://$HOME/.datapm/repository/index.db


    [upload:ckan]
    ofs.backend = reststore
    host = http://storage.ckan.net
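
The file uses INI-style sections, so it can be inspected with Python's standard `configparser` module. The sketch below is only an illustration of the layout: it assumes the section headers start at the beginning of the line, that `$HOME` is expanded by the reader rather than by the parser, and it does not claim to reproduce how datapm itself loads its configuration.

```python
import configparser
import os

# Sample configuration copied from the listing above (without indentation).
SAMPLE = """\
[dpm]
repo.default_path = $HOME/.dpm/repository
index.default = file

[index:ckan]
ckan.url = http://thedatahub.org/api/
ckan.api_key =

[upload:ckan]
ofs.backend = reststore
host = http://storage.ckan.net
"""

def load_config(text):
    """Parse INI text; interpolation is disabled so values are kept literally."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser

cfg = load_config(SAMPLE)

# Expand $HOME ourselves; the parser leaves the value untouched.
repo_path = os.path.expandvars(cfg["dpm"]["repo.default_path"])

# Collect the upload destinations, i.e. sections named [upload:<dest-id>].
upload_ids = [s.split(":", 1)[1] for s in cfg.sections() if s.startswith("upload:")]
```

An empty `ckan.api_key =` line parses as an empty string, which a caller could treat as "no key configured" and override with --api-key.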

FILES

~/.dpmrc
Per user datapm configuration file.

EXAMPLES

Grabbing some data from an index

    datapm index-add file:///....
    datapm update
    datapm search "military spending"
      some-id Military Spending 1890-1914
      some-id-2 Military Spending 1890-1914 (normalized)
    datapm install some-id
    datapm plot some-id

Get two different datasets and use them together

    datapm install pkg-a
    datapm install pkg-b
    datapm create merged
      # manual merge
      # e.g. PPP, GDP
    datapm register my-merged-package