Why does Stanford CoreNLP server split named entities into single tokens?


I'm posting the data with this command (mostly copied from the Stanford CoreNLP site):
wget --post-data 'Barack Obama was President of the United States of America in 2016' 'localhost:9000/?properties={"annotators": "ner", "outputFormat": "json"}' -O out.json
The response looks like this:
{
"sentences": [{
"index": 0,
"tokens": [{
"index": 1,
"word": "Barack",
"originalText": "Barack",
"lemma": "Barack",
"characterOffsetBegin": 0,
"characterOffsetEnd": 6,
"pos": "NNP",
"ner": "PERSON",
"before": "",
"after": " "
}, {
"index": 2,
"word": "Obama",
"originalText": "Obama",
"lemma": "Obama",
"characterOffsetBegin": 7,
"characterOffsetEnd": 12,
"pos": "NNP",
"ner": "PERSON",
"before": " ",
"after": " "
}, {
"index": 3,
"word": "was",
"originalText": "was",
"lemma": "be",
"characterOffsetBegin": 13,
"characterOffsetEnd": 16,
"pos": "VBD",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 4,
"word": "President",
"originalText": "President",
"lemma": "President",
"characterOffsetBegin": 17,
"characterOffsetEnd": 26,
"pos": "NNP",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 5,
"word": "of",
"originalText": "of",
"lemma": "of",
"characterOffsetBegin": 27,
"characterOffsetEnd": 29,
"pos": "IN",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 6,
"word": "the",
"originalText": "the",
"lemma": "the",
"characterOffsetBegin": 30,
"characterOffsetEnd": 33,
"pos": "DT",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 7,
"word": "United",
"originalText": "United",
"lemma": "United",
"characterOffsetBegin": 34,
"characterOffsetEnd": 40,
"pos": "NNP",
"ner": "LOCATION",
"before": " ",
"after": " "
}, {
"index": 8,
"word": "States",
"originalText": "States",
"lemma": "States",
"characterOffsetBegin": 41,
"characterOffsetEnd": 47,
"pos": "NNPS",
"ner": "LOCATION",
"before": " ",
"after": " "
}, {
"index": 9,
"word": "of",
"originalText": "of",
"lemma": "of",
"characterOffsetBegin": 48,
"characterOffsetEnd": 50,
"pos": "IN",
"ner": "LOCATION",
"before": " ",
"after": " "
}, {
"index": 10,
"word": "America",
"originalText": "America",
"lemma": "America",
"characterOffsetBegin": 51,
"characterOffsetEnd": 58,
"pos": "NNP",
"ner": "LOCATION",
"before": " ",
"after": " "
}, {
"index": 11,
"word": "in",
"originalText": "in",
"lemma": "in",
"characterOffsetBegin": 59,
"characterOffsetEnd": 61,
"pos": "IN",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 12,
"word": "2016",
"originalText": "2016",
"lemma": "2016",
"characterOffsetBegin": 62,
"characterOffsetEnd": 66,
"pos": "CD",
"ner": "DATE",
"normalizedNER": "2016",
"before": " ",
"after": "",
"timex": {
"tid": "t1",
"type": "DATE",
"value": "2016"
}
}]
}]
}
Am I doing something wrong? I have Java client code that recognizes at least "Barack Obama" and "United States of America" as complete named entities, but the server seems to tag each token separately. Any ideas why?
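The ner annotator only labels individual tokens, which is why every token in your response carries its own "ner" field. To have multi-token entities such as "Barack Obama" grouped together, add the entitymentions annotator to your list of annotators, for example "annotators": "ner,entitymentions". The grouped entities should then show up in an "entitymentions" list for each sentence in the JSON output.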
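To make the difference between per-token tags and entity mentions concrete, here is a rough sketch of the grouping idea: adjacent tokens that share the same non-"O" tag are merged into one mention. This is a simplified illustration only, not CoreNLP's actual implementation; the class name MentionGrouper and the method group are hypothetical, and the words and tags are copied from the response in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only: this is NOT how CoreNLP implements
// entitymentions. MentionGrouper and group() are hypothetical names.
class MentionGrouper {

    // Merges runs of adjacent tokens that share the same non-"O" NER tag.
    static List<String> group(String[] words, String[] nerTags) {
        List<String> mentions = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentTag = "O";
        for (int i = 0; i < words.length; i++) {
            String tag = nerTags[i];
            if (!tag.equals(currentTag)) {
                // The tag changed: close the open mention, if there is one.
                if (!currentTag.equals("O")) {
                    mentions.add(current + " [" + currentTag + "]");
                }
                current.setLength(0);
                currentTag = tag;
            }
            if (!tag.equals("O")) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(words[i]);
            }
        }
        // Close a mention that runs to the end of the sentence.
        if (!currentTag.equals("O")) {
            mentions.add(current + " [" + currentTag + "]");
        }
        return mentions;
    }

    public static void main(String[] args) {
        // Words and tags taken from the JSON response in the question.
        String[] words = {"Barack", "Obama", "was", "President", "of", "the",
                "United", "States", "of", "America", "in", "2016"};
        String[] tags = {"PERSON", "PERSON", "O", "O", "O", "O",
                "LOCATION", "LOCATION", "LOCATION", "LOCATION", "O", "DATE"};
        for (String mention : group(words, tags)) {
            System.out.println(mention);
        }
    }
}
```

Run on the tokens from the question, this prints "Barack Obama [PERSON]", "United States of America [LOCATION]" and "2016 [DATE]", one per line. With the server you should not need anything like this yourself, since the entitymentions annotator does the grouping for you.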
