trim conceptnet assertions to only english-language edges

lostfictions/conceptnet-trim

conceptnet-trim

trim conceptnet's ~34,000,000 multilingual assertions (about 10gb of tsv) into a tidy ~3,400,000 english-language assertions (in json format).

  1. clone this repo
  2. download the latest version of conceptnet (5.7.0 at the time of writing)
  3. extract it to `data/assertions.csv` in the root of this repo
  4. run `cargo run -r` to build and run in release mode. the trimmed assertions will be written to `data/trimmed.json`.

or download a pre-trimmed file from the releases page.
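
for a rough idea of what the trimming step involves, here is a minimal sketch of the language filter. it is not this repo's actual code. it assumes the standard conceptnet dump layout (five tab-separated columns: edge uri, relation, start node, end node, json metadata) and assumes an edge counts as english only when both endpoints are `/c/en/...` concepts. the function names `is_english_node`, `is_english_edge` and `filter_english` are hypothetical, and the json output step is left out.

```rust
/// true when a concept uri is in the english namespace, e.g. `/c/en/cat`.
/// (hypothetical helper, not taken from this repo.)
fn is_english_node(uri: &str) -> bool {
    uri == "/c/en" || uri.starts_with("/c/en/")
}

/// true when both endpoints of one assertion line are english concepts.
/// assumed column order: edge uri, relation, start node, end node, metadata.
fn is_english_edge(line: &str) -> bool {
    let mut cols = line.split('\t');
    let start = cols.nth(2); // skips columns 0 and 1, yields the start node
    let end = cols.next(); // the end node
    match (start, end) {
        (Some(s), Some(e)) => is_english_node(s) && is_english_node(e),
        // lines with fewer than four columns are dropped
        _ => false,
    }
}

/// keeps only the english-to-english lines of a tab-separated dump.
fn filter_english(input: &str) -> Vec<&str> {
    input.lines().filter(|line| is_english_edge(line)).collect()
}
```

for the full ~10gb file you would stream lines through a `std::io::BufReader` rather than hold the whole input in memory; the per-line check stays the same.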
