GOOGLE SUMMER OF CODE 2016

This page lists project ideas suitable for students taking part in Google Summer of Code 2016, or for anyone who simply wants to help a cool open-source project grow and learn a thing or two in the process 😉

AVAILABLE PROJECTS WITH MENTORS

1 – SXTAR – BACKUP / RESTORE TOOL USING AN SX BACKEND

Required skills: Python or C

Level: intermediate/advanced

As an object storage system, SX cannot be used directly for system backup. On the other hand, SX provides many useful features that a backup tool could put to good use: replication, rack-awareness, multiple revisions, encryption, and file metadata support.

The sxtar tool shall:

1- Work on the command line in a very similar way to the tar program (ideally it would be a drop-in replacement)
2- Handle all file types supported by tar; handle ownership and POSIX attributes (extended attributes are a plus)
3- Make effective use of the block-level deduplication in SX (i.e. properly align stored objects)
4- Handle large numbers of small files efficiently (i.e. pack them together in larger SX objects in a way that doesn’t conflict with #3)
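
One possible way to reconcile requirements 3 and 4 is sketched below in Python: small files are packed into larger container objects, and each file starts on a block boundary so that identical files produce identical blocks regardless of their neighbours. This is only an illustration. `BLOCK_SIZE`, `Pack` and `pack_small_files` are hypothetical names, the 4096-byte block size is an assumption (the real value depends on the SX volume), and no SX library calls are used.

```python
from dataclasses import dataclass, field

# Assumed deduplication block size; the real value depends on the SX volume.
BLOCK_SIZE = 4096


@dataclass
class Pack:
    """A hypothetical container object holding several small files."""
    index: list = field(default_factory=list)    # (name, offset, length)
    data: bytearray = field(default_factory=bytearray)

    def add(self, name: str, payload: bytes) -> None:
        # Every file starts at the current end of the pack, which is
        # always a multiple of BLOCK_SIZE thanks to the padding below.
        self.index.append((name, len(self.data), len(payload)))
        self.data += payload
        # Zero-pad up to the next block boundary.
        self.data += b"\0" * (-len(self.data) % BLOCK_SIZE)


def pack_small_files(files, max_pack_size=4 * BLOCK_SIZE):
    """Group (name, payload) pairs into block-aligned packs."""
    packs = [Pack()]
    for name, payload in files:
        padded = len(payload) + (-len(payload) % BLOCK_SIZE)
        # Start a new pack when the current one would grow past the limit.
        if packs[-1].data and len(packs[-1].data) + padded > max_pack_size:
            packs.append(Pack())
        packs[-1].add(name, payload)
    return packs
```

Padding every file to a full block wastes space for very small files, so a real implementation would have to weigh alignment against packing density.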

The sxtar tool shall NOT:

1- Reinvent the wheel. In particular, it shall not perform internal deduplication, revisioning, encryption, or compression on its own. Additionally, it shall not have built-in incremental capabilities: backups are always “full”. All of this shall happen automatically thanks to the features of SX (cluster and protocol).

For the sxtar tool, the student can choose between C (using libsxclient[*]) and Python (using sxclient[*]).
Note: working with the chosen library, and possibly modifying it, is an integral part of this project.

2 – SXRSYNC – RSYNC LIKE TOOL USING THE SX PROTOCOL

Required skills: C

Level: intermediate

Our command line clients include sxcp, a tool which emulates scp and uses the SX protocol to upload/download data to/from the SX storage backend.

sxrsync is a tool that synchronizes the content of two volumes or directories recursively, much as rsync does, but using the SX protocol.

sxrsync should start as a fork of sxcp and replicate the most important options available in rsync, for instance: --delete-before, --delete-after, --update, --append, --perms, --dry-run, --existing, --ignore-existing, --force-delete, --exclude, --exclude-from
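
To make the intended semantics of a few of these options concrete, here is a minimal Python sketch of the decision logic only (the project itself requires C). It assumes both sides have already been listed as mappings from path to size and modification time. `Entry` and `plan_sync` are hypothetical names, and nothing here talks to an SX cluster. Printing the returned plan without executing it corresponds to a dry run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical listing record for one file."""
    size: int
    mtime: float


def plan_sync(source, dest, *, update=False, existing=False,
              ignore_existing=False, delete=False):
    """Return (paths to copy, paths to delete) for a source -> dest sync."""
    copies = []
    for path, src in sorted(source.items()):
        dst = dest.get(path)
        if dst is None:
            if existing:            # --existing: never create new files
                continue
        else:
            if ignore_existing:     # --ignore-existing: never touch existing files
                continue
            if update and dst.mtime > src.mtime:
                continue            # --update: keep files newer on the destination
            if (dst.size, dst.mtime) == (src.size, src.mtime):
                continue            # unchanged: same size and modification time
        copies.append(path)
    # With a delete option, remove destination files missing from the source.
    deletions = sorted(p for p in dest if p not in source) if delete else []
    return copies, deletions
```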

3 – SX AND LIBRES3 RECIPES/PLAYBOOKS FOR PUPPET, CHEF AND ANSIBLE

Required skills: C

Level: easy

Several configuration management tools exist for automating configuration. They offer different ways of helping DevOps teams configure and control large data centers with thousands of machines.

These tools differ in their APIs, in how configuration is stored, and in their configuration DSLs.

We already have an Ansible playbook to deploy Skylable on physical and virtual machines. The goal of this project is to create similar recipes for Chef and Puppet.

The recipes should be generic enough to be used in conjunction with Docker containers and with public clouds such as Google Cloud Platform.

4 – SX COMMAND LINE CLIENTS – SUPPORT FOR MULTIPLE FILTERS

Required skills: C

Level: intermediate

Filters are special plug-ins for our SX clients. They run locally and perform specific operations when the client tools access the data on SX clusters. A use case example is data encryption, which is performed before the data gets uploaded to the cluster and after it gets downloaded to the local computer. We also have support for on-the-fly compression, storing file attributes, and object undelete.

Currently only one filter per volume is supported. The purpose of this project is to allow the use of multiple filters on a single volume.
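
The core idea of chaining can be sketched in a few lines of Python (the project itself requires C): filters are applied in order on upload and undone in reverse order on download. Everything below is hypothetical and does not reflect the actual SX filter API. `ZlibFilter` stands in for a compression filter, and `XorFilter` is a toy, insecure placeholder for an encryption filter.

```python
import zlib


class ZlibFilter:
    """Stand-in for a compression filter."""

    def on_upload(self, data: bytes) -> bytes:
        return zlib.compress(data)

    def on_download(self, data: bytes) -> bytes:
        return zlib.decompress(data)


class XorFilter:
    """Toy placeholder for an encryption filter. NOT secure."""

    def __init__(self, key: bytes):
        self.key = key

    def _xor(self, data: bytes) -> bytes:
        return bytes(b ^ self.key[i % len(self.key)] for i, b in enumerate(data))

    def on_upload(self, data: bytes) -> bytes:
        return self._xor(data)

    def on_download(self, data: bytes) -> bytes:
        return self._xor(data)


def upload(data: bytes, filters) -> bytes:
    # Apply filters in the configured order, e.g. compress, then encrypt.
    for f in filters:
        data = f.on_upload(data)
    return data


def download(data: bytes, filters) -> bytes:
    # Undo the filters in reverse order, e.g. decrypt, then decompress.
    for f in reversed(filters):
        data = f.on_download(data)
    return data
```

Order matters in such a chain: compressing before encrypting is usually preferable, since encrypted data tends to compress poorly.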

5 – SXLOCATE

Required skills: C

Level: easy

This tool emulates the behaviour of the ‘locate’ command: it searches for the specified pattern on all SX volumes available to the user, and returns the full URLs of matching files on stdout, one per line.…
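
The matching step could look roughly like the following Python sketch (the project itself requires C). It assumes the file listings of each volume are already available in memory. The `locate` function, the cluster name and the exact URL layout are illustrative assumptions. As with the locate command, a pattern without globbing characters is treated as a substring match.

```python
import fnmatch


def locate(pattern, volumes, cluster="cluster.example.com"):
    """Yield URLs of files matching pattern, one per match.

    `volumes` maps a volume name to an iterable of file paths; the
    cluster name and URL layout are assumptions for illustration.
    """
    # Like locate(1): a pattern without globbing characters is
    # treated as *pattern*.
    if not any(ch in pattern for ch in "*?["):
        pattern = f"*{pattern}*"
    for volume in sorted(volumes):
        for path in sorted(volumes[volume]):
            if fnmatch.fnmatchcase(path, pattern):
                yield f"sx://{cluster}/{volume}/{path.lstrip('/')}"


if __name__ == "__main__":
    listing = {"docs": ["/reports/2016.pdf", "/notes.txt"],
               "media": ["/img/report.png"]}
    for url in locate("report", listing):
        print(url)   # one URL per line on stdout
```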