Introduction
============
Debpartial-mirror is a tool that manipulates Debian repositories to generate
other repositories. The supported manipulations are:
   * Pruning : creating a subset of a repository based on file structure:
      - architecture  (eg i386, amd64, ...)
      - distributions (eg etch, lenny, ...)
      - components (eg main, contrib, ...)

   * Filtering : selecting a subset of packages based on various criteria:
      - package name
      - priority (eg optional, extra, ...)
      - subsection (eg gnome, kde, games, ...)
      - arbitrary package fields (tags, ...)
      - debian-cd taskfiles

   * Dependency resolution
   * Merging : combining repositories
   * Signing

The input repositories can be local or remote (via http(s) or ftp(s)).
They must be "automatic", ie have the structure:
   dists/dist-1/Release
               /component1/binary-arch1/Packages
                                       /Packages.gz
                                       /Release
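
As an illustration of the layout above, the following Python sketch lists the
Packages index paths implied by a set of distributions, components and
architectures. The function name index_paths is hypothetical and is not part
of debpartial-mirror; it only restates the directory structure shown above.

```python
from itertools import product


def index_paths(distributions, components, architectures):
    """Return the relative Packages index paths of an "automatic" repository."""
    return [
        f"dists/{dist}/{comp}/binary-{arch}/Packages"
        for dist, comp, arch in product(distributions, components, architectures)
    ]


paths = index_paths(["etch"], ["main", "contrib"], ["i386", "amd64"])
for path in paths:
    print(path)
```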

The following variants are supported:
   * Package pools
   * Non pooled repositories (such as those created by debarchiver)
   * Signed repositories (Release.gpg file)

For the moment the following are NOT supported (planned for the next release):
   * Trivial (flat) repositories
   * Source packages

Debpartial-mirror only works on repositories. If you have locally built .deb
files you will need some other tool to turn them into a repository that
debpartial-mirror can process. Packages to look at for this include:
   * debarchiver
   * reprepro


Configuration file structure
============================
The default configuration file is /etc/debpartial-mirror.conf however other
files may be used with the -c command line argument.

The configuration file has the syntax:

[GLOBAL]
global configuration...

[some repository]
configuration for some repository
...

[another repository]
configuration for another repository


That is, a global section and one section per repository.
The global section MUST contain the following keys:

mirror_dir
   base directory for the generated repositories
   each repository is stored as a subdirectory under here

architectures
   space separated list of architectures to keep (pruning)

components
   space separated list of components to keep (pruning)

distributions
   space separated list of distributions to keep (pruning)

get_suggests
get_recommends
get_provides
get_sources
get_packages
   true / false  (NOT YET IMPLEMENTED)

In addition it MAY contain the following keys:
standalone
   true => error will be raised if all dependencies are not resolved
   false (default) => allow repositories with unresolved dependencies

debug
   true / false


All of these keys (except debug) may be overridden on a per repository basis.
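
The override rule can be sketched in Python as follows. This assumes the file
is INI-style text that the standard configparser module can read; whether
debpartial-mirror itself looks keys up this way is not stated here. The helper
name effective, the mirror_dir path and the sample values are hypothetical.
The debug key, which cannot be overridden, is left out of the sketch.

```python
import configparser

# Hypothetical sample configuration; the paths and values are illustrative only.
SAMPLE = """
[GLOBAL]
mirror_dir = /var/cache/debpartial-mirror
architectures = i386
components = main
distributions = etch

[etch]
server = http://ftp.debian.org/debian
architectures = i386 amd64
"""


def effective(config, repository, key):
    """Look up key in the repository section, falling back to GLOBAL."""
    if config.has_option(repository, key):
        return config.get(repository, key)
    return config.get("GLOBAL", key)


config = configparser.ConfigParser()
config.read_string(SAMPLE)
print(effective(config, "etch", "architectures").split())  # overridden in [etch]
print(effective(config, "etch", "components").split())     # inherited from [GLOBAL]
```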


Input repositories
==================
A remote repository is specified by a http or ftp URL pointing to the repository
root (where the dists directory lives):

[debian]
server = http://ftp.debian.org/debian

A local repository is specified using a file URL:

[my-stuff]
server = file:///var/lib/custom-packages

Note that even for local repositories the (possibly pruned or filtered) files
are COPIED into the directory managed by debpartial-mirror (in contrast to
merging _between_ repositories already managed by debpartial-mirror where links
are used).
This is because the file URL could possibly point to removable storage etc.
An option to enable links for local repositories may be added in the future to
avoid wasting disk space where the input repository is permanent.


Pruning
=======
This is done using the architectures, distributions, components keys (either
in the GLOBAL section or the per repository section) :

[etch]
server = http://ftp.debian.org/debian
architectures = i386 amd64
distributions = etch
components = main

There will be no Packages files for the excluded architectures, distributions
and components. The master Release file will however still list the excluded
parts (since it is signed and thus cannot be modified).


Filtering
=========
Filtering is done using the "filter" key in a repository section and allows
the set of included packages to be reduced.

Filter takes a space separated list of filter-type:filter-value pairs.
Where multiple pairs are specified the resulting package list is the logical
OR of the packages accepted by each pair.
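
A minimal Python sketch of this syntax and of the OR rule follows. It is not
debpartial-mirror's own code: the function names parse_filter and accepted and
the sample packages are hypothetical, exclude-* pairs are ignored, and matching
every field from its start is an assumption. Each pair is split on its first
colon only, because a filter-value may itself contain colons.

```python
import re


def parse_filter(value):
    """Split a filter value into (filter-type, filter-value) pairs."""
    pairs = []
    for item in value.split():
        # Split on the first colon only: the value may contain more colons.
        filter_type, _, filter_value = item.partition(":")
        pairs.append((filter_type, filter_value))
    return pairs


def accepted(package, pairs):
    """True if any include pair matches (logical OR); exclude-* is ignored."""
    return any(
        re.match(pattern, package.get(filter_type, ""))
        for filter_type, pattern in pairs
        if filter_type in ("name", "subsection", "priority")
    )


pairs = parse_filter("name:apache priority:required")
print(accepted({"name": "apache2", "priority": "optional"}, pairs))
print(accepted({"name": "vim", "priority": "optional"}, pairs))
```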

The allowed filter-types and the meaning of the associated filter-value are:

name
   regular expression matching a package name (from the start of the name)
   Eg  name:linux will match linux-2.6, linux-image but NOT syslinux
   whereas name:.*linux will match all three packages above
   See the python documentation for details of the regular expression syntax

subsection
   regular expression matching a package subsection (eg games, kde, ...)

priority
   regular expression matching a package priority (eg optional, extra, ...)

exclude-name, exclude-subsection, exclude-priority
   as above except that the matched packages are excluded

include-from
   a file in the debian-cd task list format listing packages to include

exclude-from
   a file in the debian-cd task list format listing packages to exclude

field-XXX
   regular expression matching a field in the package header. Most useful XXX
   is probably "Tag". The package is included if a match occurs.
   Note that, unlike for name, subsection and priority, no indexes are used,
   so this may be quite inefficient for large repositories.

exclude-field-XXX
   regular expression matching a field in the package header. Most useful XXX
   is probably "Tag". The package is excluded if a match occurs.

Example
[my-repository]
filter = exclude-subsection:games|kde exclude-field-Tag:.*implemented-in::ruby

(No I don't have anything against ruby or kde really!)
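
The "from the start of the name" behaviour of the name filter corresponds to
what Python's re.match does. The sketch below assumes that this is how the
patterns are applied; the helper name matching is hypothetical.

```python
import re

names = ["linux-2.6", "linux-image", "syslinux"]


def matching(pattern, names):
    """Return the names that the pattern matches from their first character."""
    return [name for name in names if re.match(pattern, name)]


print(matching("linux", names))    # syslinux is not matched
print(matching(".*linux", names))  # all three names are matched
```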

A filtered repository only contains a subset of the original package files but
FULL index files (Packages, Packages.gz etc) for all included architectures,
components and distributions. This is done to avoid breaking signatures but does
mean that a package search on such a repository will list packages that have
been removed (an attempt to install them will thus fail).

A future release may include an option to force index regeneration if desired
(and optionally re-sign with another key).


Dependency resolution
=====================
An attempt will always be made to resolve dependencies within a repository.
This means that even if a package would normally be excluded by a filter it
will still be included if it is required as a dependency of another package
accepted by the filter.

For example
[my-repository]
filter = name:apache exclude-name:perl

Will *include* the apache package and its dependencies (of which perl is one).

Currently only *hard* dependencies (ie the Depends: field in the package control
file) are considered. A later release will implement the get_suggests and
get_recommends configuration keys to extend this to softer dependencies.

If dependency resolution fails the result depends on the "standalone"
configuration key. If this is false (or not specified) a warning message is
printed but processing continues. This is reasonable since the repository
may well be an "extension" to be used in conjunction with other repositories
(typically the official Debian ones).

On the other hand if standalone = true an error occurs if all dependencies
cannot be resolved.
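
The behaviour described above can be sketched as a closure over hard
dependencies. This is an illustration only, not debpartial-mirror's resolver:
the function name resolve and the package data are hypothetical, and versioned
or alternative dependencies are ignored.

```python
def resolve(selected, depends, standalone=False):
    """Return the selected packages plus everything reachable via Depends."""
    result, missing = set(), set()
    pending = list(selected)
    while pending:
        name = pending.pop()
        if name in result or name in missing:
            continue
        if name not in depends:
            # The package is not available in this repository.
            missing.add(name)
            continue
        result.add(name)
        pending.extend(depends[name])
    if missing:
        message = "unresolved dependencies: " + ", ".join(sorted(missing))
        if standalone:
            raise RuntimeError(message)  # standalone = true: treat as an error
        print("warning:", message)       # otherwise warn and carry on
    return result


depends = {"apache": ["perl", "libc6"], "perl": ["libc6"], "libc6": []}
print(sorted(resolve({"apache"}, depends)))
```

Here perl is pulled in because apache depends on it, even if a filter such as
exclude-name:perl would otherwise have left it out.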

It is also possible to specify other repositories to use for dependency
resolution. For example

[etch]
...

[my-stuff]
server = file:///var/lib/custom-packages
filter = name:my-thing
resolve_deps_using = etch
standalone = true

Will take the package my-thing from the custom-packages local repository AND
all its dependencies, including those in etch, to create a standalone
repository.

Note that this requires the index files to be regenerated (since the resulting
repository has packages that did not exist in custom-packages). This may have
implications for signing.

Merging
=======
Merging is the process of taking two (or more) repositories and combining them
into one. This is done using the "backends" configuration key rather than
"server":

[etch-main]
server = http://ftp.debian.org/debian
distributions = etch
components = main

[custom]
server = file:///var/lib/custom-packages

[custom-etch-main]
backends = etch-main custom

Here custom-etch-main will contain all the packages from etch-main and custom.

This requires index and release file regeneration.
The following keys may optionally be used in a merged repository to specify
the contents of the release file:
   origin
   label
   suite
   codename
   version
   description
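
As a rough illustration, these keys could map onto the header fields of a
Release file as in the Python sketch below. The one-to-one mapping to
capitalised field names is an assumption, and the function name release_header
and the sample values are hypothetical.

```python
RELEASE_KEYS = ("origin", "label", "suite", "codename", "version", "description")


def release_header(options):
    """Render the release keys that are set as 'Field: value' lines."""
    lines = []
    for key in RELEASE_KEYS:
        if key in options:
            lines.append(f"{key.capitalize()}: {options[key]}")
    return "\n".join(lines)


header = release_header({"origin": "Example", "suite": "stable", "codename": "etch"})
print(header)
```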

It is also possible to filter while merging:

[mything-etch-main]
backends = etch-main custom
filter_custom = name:mything


Signing
=======
When the release and index files are regenerated, that is :
   * When merging
   * When using resolve_deps_using

The resulting repository may be signed using the "signature_key" configuration
key. This takes either "default" or any value accepted as a key id by gpg :

[myRepo]
backends = src1 src2
signature_key = 9B2DC6BF

[anotherRepo]
backends = src3 src4
signature_key = default

Obviously the key to be used must be available in the secret keyring of the
user running debpartial-mirror.
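
For reference, a detached signature such as Release.gpg can be produced with
gpg directly. The Python sketch below only builds the command line and does
not run it; it is not necessarily how debpartial-mirror invokes gpg, and the
function name gpg_sign_command and the path are hypothetical.

```python
def gpg_sign_command(release_path, signature_key="default"):
    """Build a gpg command that writes an armored detached signature."""
    command = ["gpg", "--armor", "--output", release_path + ".gpg"]
    if signature_key != "default":
        # Select a specific secret key; otherwise gpg uses its default key.
        command += ["--local-user", signature_key]
    command += ["--detach-sign", release_path]
    return command


print(" ".join(gpg_sign_command("dists/etch/Release", "9B2DC6BF")))
```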

Running debpartial-mirror
=========================
Typical invocations:

debpartial-mirror all
   Run all mirror actions using default configuration file

debpartial-mirror -c myConfig.conf all
   Run all mirror actions using a specific configuration file


See the manpage for more information.
