Opened 18 years ago

Last modified 22 months ago

#1062 in-progress enhancement

add copyright headers to googlefs; cleanup code.

Reported by: mmu_man
Owned by: mmu_man
Priority: low
Milestone: Unscheduled
Component: File Systems/GoogleFS
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

The code is messy; just a personal reminder. Don't even look at it :p

Change History (10)

comment:1 by mmu_man, 18 years ago

Status: new → assigned

comment:2 by wkornewald, 18 years ago

I'm just not sure if we want to keep GoogleFS in our repository because of the legal issues it might have.

comment:3 by mmu_man, 18 years ago

Can always rename it... I don't think they fear our competition though :D

comment:4 by wkornewald, 18 years ago

I wasn't talking about the naming issue. The problem is that we are using their search results and changing the way they're displayed (we even remove the ads). AFAIK, they don't allow manipulating the resulting HTML. The search results must look exactly the same as in a web browser. At least, we discussed this on the admin list.

comment:5 by mmu_man, 18 years ago

The ad removal is just an unintended result of the simplistic HTML parser (it also skips non-HTML pages like PDF). I started looking at their XML API but it was restricted and they removed it anyway... so much for the altruistic period :) They probably can't do anything against the code itself (free speech), only against shipping binaries... I don't think they ever sued over Firefox search side-bars, which do parse the HTML, but I'll remove it if deemed necessary.

comment:6 by scottmc, 12 years ago

It's been 6 years, have you finished this yet? ;) Is it even still needed/useful?

comment:7 by mmu_man, 12 years ago

Well yeah, I probably have a patch for that lying in the svn working copy I moved away when switching to git...

comment:8 by pulkomandy, 10 years ago

Milestone: R1 → Unscheduled

comment:9 by pulkomandy, 3 years ago

Current status:

  • SMAP violations (I fixed some)
  • The current Google result page is way too complicated to parse
  • I tried switching to html.duckduckgo.com, which is a lot simpler to parse, but it requires HTTPS, and I don't think we want to implement that in the kernel?

So, if this has any future, it should be moved to userlandfs.

And the "legal" issue still stands: it's not clear whether we're allowed to use the web search results in this way. Some info about this here: https://duckduckgo.com/api which explains why DDG can't provide an API for search results (basically it's because they outsource the search to other companies).

comment:10 by marcoapc, 22 months ago

This project partially implements the Google File System in C++, under the Apache license: https://github.com/Gan-Tu/cppGFS2.0

The closest thing to the GFS is this project, also under the Apache license: https://github.com/quantcast/qfs

I believe this project, under the MIT license, is more suitable for Haiku: http://ori.scs.stanford.edu/
