Google Summer of Code Proposal - Revamp GetTor

Below is my proposal for the 2014 GSoC.

== What project would you like to work on? ==

I'm interested in the Revamp GetTor project. I believe GetTor has a huge impact on circumventing censorship, and that it could be expanded to do a lot more than it does now. I propose a modular redesign of the current code, in which users reach GetTor through common services and get the information they need. This design will also make it possible to implement new modules in the future.

The idea I propose can be explained in the following points:

  1. A modular design, with a core module in charge of transmitting data to the other modules. Each module will be accessed via a different "service" (e.g. SMTP).

  2. The core module will be in charge of receiving requests from the other modules and answering them. For example, the SMTP module will say "Give me the URL links for TBB in this language" and the core module will answer with the desired links. These links could be retrieved from several locations. My proposal is to store them in a private git repository available only to trusted Tor developers, so that the links and other data can change over time in response to different censorship behaviours.

  3. As I said, each module will be accessed via a different "service". I propose the following three to start with: SMTP, Skype and Twitter.

All three modules should run on the same server. The core module should provide formatted data in order to make it easy to create new modules that interact with it. Clean and scalable code is a must.

I see two options for interacting with the core module. The first is via system commands (modules will need to make a system call to get the info):

  $ gettor-core --pkg=TBB --lang=en --version=latest --type=URL --os=linux
  {"Status":"Success", "Content":"http://some.page1, https://some.page2"}

  $ gettor-core --pkg=TBB --lang=es --version=latest --type=URI --os=linux
  {"Status":"Success", "Content":"/home/tbb/latest/latest_es.tgz"}

  $ gettor-core --pkg=TBB --lang=jj --version=latest --type=URL --os=linux
  {"Status":"Fail", "Content":"Unsupported language"}
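
As a rough sketch of how a module might consume this output (assuming Python 3; `parse_core_output`, `ask_core` and `CoreError` are hypothetical names, and `gettor-core` itself does not exist yet), the module would run the command and turn the JSON reply into a list of links:

```python
import json
import subprocess


class CoreError(Exception):
    """Raised when the core reports a failure (hypothetical name)."""


def parse_core_output(raw):
    """Turn the core's JSON reply into a list of content items."""
    reply = json.loads(raw)
    if reply["Status"] != "Success":
        raise CoreError(reply["Content"])
    # The sample replies above separate multiple links with a comma.
    return [item.strip() for item in reply["Content"].split(",")]


def ask_core(pkg, lang, os_name, kind="URL", version="latest"):
    """Run the proposed gettor-core command and parse its reply.

    The command and its flags are the ones proposed above; they are
    not implemented yet, so this function is illustrative only.
    """
    cmd = [
        "gettor-core",
        f"--pkg={pkg}",
        f"--lang={lang}",
        f"--version={version}",
        f"--type={kind}",
        f"--os={os_name}",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_core_output(result.stdout)
```

With this approach a module written in any language could talk to the core, at the cost of spawning a process for every request.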

The second is as a Python module, similar to the current code:

  import gettor.core

  # receive the request and parse it into params

  response = gettor.core.get(params)

  if response['status'] == 'success':
      pass  # send response['content'] back to the user
  else:
      pass  # process the error in response['content']
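
On the core side, a minimal sketch of what `get` could look like is shown here. Everything in it is an assumption made for illustration: the parameter names are borrowed from the command-line flags above, the `'fail'` status value and the error messages are invented, and the hard-coded `LINKS` table stands in for data that would really come from the private git repository.

```python
# Hypothetical link table; in the proposal this data would be read
# from the private git repository instead of being hard-coded.
LINKS = {
    ("TBB", "linux", "en"): ["http://some.page1", "https://some.page2"],
}

# Parameters every request is assumed to carry.
REQUIRED = ("pkg", "os", "lang")


def get(params):
    """Answer a module's request with a status/content dictionary."""
    missing = [key for key in REQUIRED if key not in params]
    if missing:
        return {
            "status": "fail",
            "content": "Missing parameter: " + ", ".join(missing),
        }
    key = (params["pkg"], params["os"], params["lang"])
    links = LINKS.get(key)
    if links is None:
        return {"status": "fail", "content": "Unsupported request"}
    # Return a copy so callers cannot modify the shared table.
    return {"status": "success", "content": list(links)}
```

Returning a dictionary in both the success and the failure case keeps the calling code in each module down to a single `if`/`else`, as in the example above.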

Parsing the requests to call the core module should be handled by each module, as well as flood prevention. Logging and stats should be handled by the core module.
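
For flood prevention, one possible approach (a sketch only; the `FloodGuard` name, the limits and the sliding-window technique are my assumptions, not part of the current GetTor code) is for each module to keep a short in-memory history of recent requests per sender and refuse to answer when a sender exceeds the limit:

```python
import time
from collections import defaultdict, deque


class FloodGuard:
    """Allow at most max_requests per sender within `window` seconds."""

    def __init__(self, max_requests=3, window=3600.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock  # injectable so the guard can be tested
        self._seen = defaultdict(deque)  # sender -> recent timestamps

    def allow(self, sender):
        """Return True if this sender may be answered right now."""
        now = self.clock()
        recent = self._seen[sender]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

The history lives only in memory and expires with the window, so nothing about users needs to be written to disk; the sender key could also be a hash of the address rather than the address itself.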

Here is a (very) simple diagram of what I've tried to explain:

              -----------
           ->|SMTP Module|
         /    -----------  \                                       Trusted Tor developer
        /                   \                                      /  
       /     ------------    \       ------          --------   <-
USERS<----> |Skype Module|   <----> | CORE | <----> | GIT REPO|  <---- Trusted Tor developer
       \      -----------    /       ------          --------   <-
        \                   /         |           (Data source)   \  
         \     --------------         |                            \ Trusted Tor developer
          \-->|Twitter Module|        | 
               --------------         |
            \                         |
             \             ---------------
              \---------->| OTHER SERVICES|
                           --------------- 
 
  

The main idea is that this new design would allow for new modules to be implemented in the future.

Minor aspects and/or features I'd like to mention:

  1. Users could send formatted mails with their public PGP keys via the SMTP service, and the bot will answer them with encrypted content.

  2. Users could notify services about censored links. For example, if a user requests the links via SMTP and one of them has been blocked, he/she could send a message to the SMTP service reporting this. This info, along with stats about which services are more popular, will be saved in a log file (I think this is how it works now) and/or a database. The purpose of this data is to improve GetTor; no info about users will be recorded.

  3. Another storage option could be GitHub (as you may know, repos can be downloaded as ZIP files).

  4. For all the above, GetTor will need official Twitter/GitHub accounts.
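
To illustrate the stats and blocked-link reports from point 2, here is a small sketch using SQLite from the Python standard library. The table layout and function names are hypothetical; the point is only that each row stores the day, the service and (for reports) the link, and nothing that identifies the user.

```python
import sqlite3
from datetime import date


def open_stats(path=":memory:"):
    """Open (or create) the stats database; the schema is illustrative."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS requests (day TEXT, service TEXT)")
    db.execute(
        "CREATE TABLE IF NOT EXISTS blocked (day TEXT, service TEXT, link TEXT)"
    )
    return db


def record_request(db, service, day=None):
    """Count one request for a service; no sender data is stored."""
    day = day or date.today().isoformat()
    db.execute("INSERT INTO requests VALUES (?, ?)", (day, service))
    db.commit()


def record_blocked_link(db, service, link, day=None):
    """Store a report that a link was blocked; no sender data is stored."""
    day = day or date.today().isoformat()
    db.execute("INSERT INTO blocked VALUES (?, ?, ?)", (day, service, link))
    db.commit()


def requests_per_service(db):
    """Return a {service: request count} dictionary."""
    rows = db.execute(
        "SELECT service, COUNT(*) FROM requests GROUP BY service"
    )
    return dict(rows.fetchall())
```

A plain log file would work just as well; the database only makes it easier to answer questions such as which service is the most popular.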

I think the best way to achieve this is to rewrite GetTor, but if you prefer to extend the current code, I'm okay with it. All of this will be done in Python.

** TIMELINE **

As I won't be working full-time on GSoC, I suggest beginning on April 21st. That way I could have the module design and part of the documentation ready before actually starting to code on May 19th.

April 21 - May 19
Module design (interfaces, diagrams, etc.) and part of the documentation.

May 19 - June 19
Core module implementation, documentation and testing.

June 19 - July 19
SMTP refactoring and Twitter module implementation. Documentation and testing for both.

July 19 - August 19
Skype module implementation, documentation and testing.

I could easily set up the environment for testing the core, Twitter and Skype modules, but I will need help with SMTP. I will continue to work on pending tasks after August 19 if needed. In any case, I'd be glad to keep collaborating with GetTor and the Tor project in general.


== Point us to a code sample ==

I've coded my own Perl module, 'adopted' another one, coded an NSE script, done some homework projects (I actually wrote a Twitter bot in Python a couple of years ago), and solved various competitive programming problems. All of this can be found on my GitHub repo. I've also uploaded the required patch for sending SHA1 checksums.


== Why Tor? ==

Because it is one of the most important projects on anti-censorship and privacy, if not _the_ most important. All Open Source projects have implications; some are for the benefit of developers, sysadmins, teachers, etc., but Tor is for the benefit of us all. Tor stands for the future we want for the internet and society.


== Experiences in free software development environments ==

I don't have much experience with free software development, but I do like to collaborate on technical projects that help others. As for free software, I've taken part in the "adoption" campaign for Perl modules on CPAN, whose main goal is to fix bugs and co-maintain modules that have been "abandoned" and/or marked as available for adoption by their authors or the CPAN community. As for collaboration in general, I've been part of the technical committee of the Chilean Olympiad in Informatics for over a year, mainly configuring and setting up the contest management system (CMS) for major events and training camps, all of this as a volunteer.


== Other commitments (a second job, classes, etc)? ==

As I live in the southern hemisphere, my semester begins in March and ends in July. I still don't know when I'll have my exams, but they should be sometime in mid-April, mid-June and mid-July. Nevertheless, I've been working and studying for over a year so I'm used to it, and I've taken just a few courses. I work as a freelancer, so if I'm accepted I'll consider this to be my job.


== Will your project need more work and/or maintenance after the summer ends?
What are the chances you will stick around and help out with that and other related projects? ==

The idea of the project is that other modules can be integrated with GetTor, so it will definitely need more work. I'm not sure how much maintenance it will need, but I will be glad to co-maintain it if needed.

The chances of sticking around are very high. Even though I appeared near the deadline, I'm very interested in the Tor project, and to be honest, it was very difficult to pick a single project to apply to, so I'll definitely stick around and help out with GetTor and other Tor projects :-)


== What is your ideal approach to keeping everybody informed of your progress over the course of the project?
Said another way, how much of a "manager" will you need your mentor to be? ==

My idea is to send weekly reports to the mailing list (for mentors AND community feedback). I will also be connected on IRC whenever I'm working on GetTor, so that I can discuss any questions or problems.


== What school are you attending? What year are you, and what's your major/degree/focus? ==

I'm a third year student, currently pursuing a Bachelor of Science in Computer Science at the University of Santiago, Chile. I also studied computer engineering for three years at Federico Santa Maria Technical University, but I dropped out.


== How can we contact you to ask you further questions? ==

You can write to israel.leiva@usach.cl. My IRC nickname is ilv; I connect almost every day (@oftc and/or @freenode). My Twitter handle is @criptomante. My current time zone is UTC-3.


== Are you applying to other projects for GSoC? ==

Nope.

== Is there anything else that we should know that will make us like your project more? ==

As I said earlier, it was very difficult to choose a single idea to collaborate on; all of them are awesome! I really want to collaborate with the Tor project, whether I'm accepted for GSoC or not.

I've been programming since I was 15, and using Linux since I was 16. I used to be a web defacer (beware, the article greatly exaggerates the real facts). I'm also very interested in competitive programming; I participated in the International Olympiad in Informatics in Croatia, and in the South America ACM-ICPC Regionals four times. I like to believe I have a creative mind.

I choose not to live in a dystopia.