java


Why does Stanford CoreNLP server split named entities into single tokens?


I'm using this command to post the data (mostly copied from the Stanford site):
wget --post-data 'Barack Obama was President of the United States of America in 2016' 'localhost:9000/?properties={"annotators": "ner", "outputFormat": "json"}' -O out.json
The response looks like this:
{
"sentences": [{
"index": 0,
"tokens": [{
"index": 1,
"word": "Barack",
"originalText": "Barack",
"lemma": "Barack",
"characterOffsetBegin": 0,
"characterOffsetEnd": 6,
"pos": "NNP",
"ner": "PERSON",
"before": "",
"after": " "
}, {
"index": 2,
"word": "Obama",
"originalText": "Obama",
"lemma": "Obama",
"characterOffsetBegin": 7,
"characterOffsetEnd": 12,
"pos": "NNP",
"ner": "PERSON",
"before": " ",
"after": " "
}, {
"index": 3,
"word": "was",
"originalText": "was",
"lemma": "be",
"characterOffsetBegin": 13,
"characterOffsetEnd": 16,
"pos": "VBD",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 4,
"word": "President",
"originalText": "President",
"lemma": "President",
"characterOffsetBegin": 17,
"characterOffsetEnd": 26,
"pos": "NNP",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 5,
"word": "of",
"originalText": "of",
"lemma": "of",
"characterOffsetBegin": 27,
"characterOffsetEnd": 29,
"pos": "IN",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 6,
"word": "the",
"originalText": "the",
"lemma": "the",
"characterOffsetBegin": 30,
"characterOffsetEnd": 33,
"pos": "DT",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 7,
"word": "United",
"originalText": "United",
"lemma": "United",
"characterOffsetBegin": 34,
"characterOffsetEnd": 40,
"pos": "NNP",
"ner": "LOCATION",
"before": " ",
"after": " "
}, {
"index": 8,
"word": "States",
"originalText": "States",
"lemma": "States",
"characterOffsetBegin": 41,
"characterOffsetEnd": 47,
"pos": "NNPS",
"ner": "LOCATION",
"before": " ",
"after": " "
}, {
"index": 9,
"word": "of",
"originalText": "of",
"lemma": "of",
"characterOffsetBegin": 48,
"characterOffsetEnd": 50,
"pos": "IN",
"ner": "LOCATION",
"before": " ",
"after": " "
}, {
"index": 10,
"word": "America",
"originalText": "America",
"lemma": "America",
"characterOffsetBegin": 51,
"characterOffsetEnd": 58,
"pos": "NNP",
"ner": "LOCATION",
"before": " ",
"after": " "
}, {
"index": 11,
"word": "in",
"originalText": "in",
"lemma": "in",
"characterOffsetBegin": 59,
"characterOffsetEnd": 61,
"pos": "IN",
"ner": "O",
"before": " ",
"after": " "
}, {
"index": 12,
"word": "2016",
"originalText": "2016",
"lemma": "2016",
"characterOffsetBegin": 62,
"characterOffsetEnd": 66,
"pos": "CD",
"ner": "DATE",
"normalizedNER": "2016",
"before": " ",
"after": "",
"timex": {
"tid": "t1",
"type": "DATE",
"value": "2016"
}
}]
}]
}
Am I doing something wrong? I have Java client code that at least recognizes "Barack Obama" and "United States of America" as complete named entities, but the server seems to tag each token separately. Any ideas why?
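The ner annotator only assigns a tag to each individual token. To get multi-token entities grouped together, add the entitymentions annotator to your list of annotators, for example "annotators": "ner,entitymentions".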
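To illustrate what the grouping step amounts to, here is a minimal sketch in plain Java that merges adjacent tokens sharing the same NER tag, using the word/ner pairs from the response above. The class `MentionGrouper` and its `group` method are hypothetical names for this illustration and are not part of CoreNLP; the real annotator may apply additional rules, so treat this as an approximation of the idea rather than the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a CoreNLP class): merges runs of adjacent tokens
// that carry the same non-"O" NER tag into a single mention string.
class MentionGrouper {

    // Each token is a {word, nerTag} pair, as found in the "tokens" array.
    static List<String> group(String[][] tokens) {
        List<String> mentions = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentTag = "O";
        for (String[] token : tokens) {
            String word = token[0];
            String tag = token[1];
            if (!tag.equals("O") && tag.equals(currentTag)) {
                // Same entity type as the previous token: extend the mention.
                current.append(' ').append(word);
            } else {
                // Tag changed: close the open mention, if any, and start over.
                if (!currentTag.equals("O")) {
                    mentions.add(current + "/" + currentTag);
                }
                current.setLength(0);
                if (!tag.equals("O")) {
                    current.append(word);
                }
                currentTag = tag;
            }
        }
        // Close a mention that runs to the end of the sentence.
        if (!currentTag.equals("O")) {
            mentions.add(current + "/" + currentTag);
        }
        return mentions;
    }

    public static void main(String[] args) {
        String[][] tokens = {
            {"Barack", "PERSON"}, {"Obama", "PERSON"}, {"was", "O"},
            {"President", "O"}, {"of", "O"}, {"the", "O"},
            {"United", "LOCATION"}, {"States", "LOCATION"},
            {"of", "LOCATION"}, {"America", "LOCATION"},
            {"in", "O"}, {"2016", "DATE"}
        };
        // Prints [Barack Obama/PERSON, United States of America/LOCATION, 2016/DATE]
        System.out.println(group(tokens));
    }
}
```

With the entitymentions annotator enabled on the server, this kind of post-processing should not be needed on the client side; the sketch only shows why per-token tags alone look "split".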
