This is the Java wrapping of the soaplab service Freeling_it, developed in PANACEA.
Current software version: 1.0, released on 15/04/2019. It is available on the CNR-ILC GitHub.
In this version, the ILC4CLARIN Freeling service is available only for Italian.
Every endpoint performs the same operations (tokenization, POS tagging, and lemmatization); depending on the endpoint and the format parameter, a valid TCF, TAB, or KAF document is produced.
The service handles plain text (TXT), TCF, and KAF documents, but the MIME type of the incoming request must be set accordingly.
This page explains how to invoke the offered services.
POST endpoints are the following:
freeling_it/runservice
(POST service to analyze plain texts and to produce valid TCF, TAB, or KAF documents, according to the format parameter)
freeling_it/tcf/runservice
(POST service to analyze TCF documents and to produce valid TCF, TAB, or KAF documents, according to the format parameter)
freeling_it/kaf/runservice
(POST service to analyze KAF documents and to produce valid TCF, TAB, or KAF documents, according to the format parameter)
Similarly, GET endpoints have been set up for possible integration into the LRS:
freeling_it/lrs
(GET service to analyze plain texts and to produce valid TCF, TAB, or KAF documents, according to the format parameter)
freeling_it/kaf/lrs
(GET service to analyze KAF documents and to produce valid TCF, TAB, or KAF documents, according to the format parameter)
freeling_it/tcf/lrs
(GET service to analyze TCF documents and to produce valid TCF, TAB, or KAF documents, according to the format parameter)
For the POST endpoints, the format parameter must be supplied in the query string:
CURL:
Some examples:
curl -H 'content-type: text/plain' --data-binary @plain-file.txt -X POST freeling_it/runservice?format=tab
curl -H 'content-type: text/plain' --data-binary @plain-file.txt -X POST freeling_it/runservice?format=tcf
curl -H 'content-type: text/plain' --data-binary @plain-file.txt -X POST freeling_it/runservice?format=kaf
curl -H 'content-type: text/tcf+xml' --data-binary @tcf-file.xml -X POST freeling_it/tcf/runservice?format=tab
curl -H 'content-type: text/tcf+xml' --data-binary @tcf-file.xml -X POST freeling_it/tcf/runservice?format=tcf
curl -H 'content-type: text/tcf+xml' --data-binary @tcf-file.xml -X POST freeling_it/tcf/runservice?format=kaf
curl -H 'content-type: text/xml' --data-binary @kaf-file.xml -X POST freeling_it/kaf/runservice?format=tab
curl -H 'content-type: text/xml' --data-binary @kaf-file.xml -X POST freeling_it/kaf/runservice?format=tcf
curl -H 'content-type: text/xml' --data-binary @kaf-file.xml -X POST freeling_it/kaf/runservice?format=kaf
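The pairing of input type, endpoint, and content type shown in the examples above can be captured in a small helper. The following is only a sketch: BASE_URL is a placeholder (the page gives relative paths only, so the real host must be filled in), and the function names freeling_route and freeling_curl_cmd are made up for illustration. The helper prints the curl command rather than running it, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical base URL: the page gives only relative paths, so the host
# below is a placeholder to be replaced with the real deployment address.
BASE_URL="${BASE_URL:-http://localhost:8080}"

# Print "<path> <content-type>" for an input kind (txt, tcf, kaf),
# following the endpoint list and the content types used in the examples.
freeling_route() {
    case "$1" in
        txt) echo "freeling_it/runservice text/plain" ;;
        tcf) echo "freeling_it/tcf/runservice text/tcf+xml" ;;
        kaf) echo "freeling_it/kaf/runservice text/xml" ;;
        *)   echo "unknown input kind: $1" >&2; return 1 ;;
    esac
}

# Print (without running) the curl command for one input file.
# Usage: freeling_curl_cmd KIND FILE FORMAT
freeling_curl_cmd() {
    kind=$1 file=$2 format=$3
    # Only the three output formats named on this page are accepted.
    case "$format" in
        tab|tcf|kaf) ;;
        *) echo "unknown output format: $format" >&2; return 1 ;;
    esac
    route=$(freeling_route "$kind") || return 1
    path=${route% *}     # part before the space: endpoint path
    ctype=${route#* }    # part after the space: content type
    printf "curl -H 'content-type: %s' --data-binary @%s -X POST '%s/%s?format=%s'\n" \
        "$ctype" "$file" "$BASE_URL" "$path" "$format"
}
```

For instance, freeling_curl_cmd tcf tcf-file.xml tab prints the command for the TCF endpoint with TAB output. The URL is quoted in the printed command so that the ? in the query string is not treated as a glob character by the shell.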
WGET
Similarly, using wget:
wget --header 'content-type: text/plain' --post-file=plain-file.txt freeling_it/runservice?format=tab
wget --header 'content-type: text/plain' --post-file=plain-file.txt freeling_it/runservice?format=tcf
wget --header 'content-type: text/plain' --post-file=plain-file.txt freeling_it/runservice?format=kaf
wget --header 'content-type: text/tcf+xml' --post-file=tcf-file.xml freeling_it/tcf/runservice?format=tab
wget --header 'content-type: text/tcf+xml' --post-file=tcf-file.xml freeling_it/tcf/runservice?format=tcf
wget --header 'content-type: text/tcf+xml' --post-file=tcf-file.xml freeling_it/tcf/runservice?format=kaf
wget --header 'content-type: text/xml' --post-file=kaf-file.xml freeling_it/kaf/runservice?format=tab
wget --header 'content-type: text/xml' --post-file=kaf-file.xml freeling_it/kaf/runservice?format=tcf
wget --header 'content-type: text/xml' --post-file=kaf-file.xml freeling_it/kaf/runservice?format=kaf
For the GET endpoints, the url parameter indicates the URL where the text to analyze is found. The language and the format must also be supplied as parameters:
CURL
Some examples:
WGET
Some examples:
As plain-text input, you can use:
Mi chiamo Riccardo. Abito a Roma
As TCF input, you can use:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://de.clarin.eu/images/weblicht-tutorials/resources/tcf-04/schemas/latest/d-spin_0_4.rnc" type="application/relax-ng-compact-syntax"?>
<D-Spin xmlns="http://www.dspin.de/data" version="0.4">
  <md:MetaData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:cmd="http://www.clarin.eu/cmd/" xmlns:md="http://www.dspin.de/data/metadata" xsi:schemaLocation="http://www.clarin.eu/cmd/ http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1320657629623/xsd">
  </md:MetaData>
  <tc:TextCorpus xmlns:tc="http://www.dspin.de/data/textcorpus" lang="it">
    <tc:text> Mi chiamo Alfredo. Abito a Roma. </tc:text>
  </tc:TextCorpus>
</D-Spin>
As KAF input, you can use:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<KAF xml:lang="it" version="1.0">
  <kafHeader>
    <fileDesc />
    <linguisticProcessors layer="text">
      <lp name="it.cnr.ilc.panacea.service.impl.FreelingIt" version="1.0" timestamp="2019-04-12T15:04:32.096Z"/>
    </linguisticProcessors>
  </kafHeader>
  <text>
    <wf wid="w1" sent="1" para="1" offset="0" length="2"><![CDATA[Mi]]></wf>
    <wf wid="w2" sent="1" para="1" offset="3" length="6"><![CDATA[chiamo]]></wf>
    <wf wid="w3" sent="1" para="1" offset="10" length="8"><![CDATA[Riccardo]]></wf>
    <wf wid="w4" sent="1" para="1" offset="18" length="1"><![CDATA[.]]></wf>
    <wf wid="w5" sent="1" para="1" offset="19" length="5"><![CDATA[Abito]]></wf>
    <wf wid="w6" sent="1" para="1" offset="25" length="1"><![CDATA[a]]></wf>
    <wf wid="w8" sent="2" para="1" offset="27" length="4"><![CDATA[Roma]]></wf>
  </text>
</KAF>
As the URL, you may use:
https://raw.githubusercontent.com/clarin-eric/LRS-Hackathon/master/samples/resources/txt/hermes-it.txt
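A GET request to one of the lrs endpoints could be assembled along the following lines. This is a sketch under several assumptions: BASE_URL is a placeholder host (the page lists relative paths only), the function name freeling_lrs_cmd is made up for illustration, and language is assumed to be the literal query key, like url and format. The helper prints the curl command instead of running it; curl's -G and --data-urlencode options take care of URL-encoding the parameter values.

```shell
#!/bin/sh
# Hypothetical base URL: the page lists only relative endpoint paths.
BASE_URL="${BASE_URL:-http://localhost:8080}"

# Print (without running) a curl GET command for one of the lrs endpoints.
# Usage: freeling_lrs_cmd KIND TEXT_URL LANGUAGE FORMAT
# Assumption: "language" is the literal query key, like "url" and "format".
freeling_lrs_cmd() {
    kind=$1 text_url=$2 language=$3 format=$4
    case "$kind" in
        txt) path="freeling_it/lrs" ;;
        tcf) path="freeling_it/tcf/lrs" ;;
        kaf) path="freeling_it/kaf/lrs" ;;
        *)   echo "unknown input kind: $kind" >&2; return 1 ;;
    esac
    # -G turns the --data-urlencode pairs into a URL-encoded query string.
    printf "curl -G --data-urlencode 'url=%s' --data-urlencode 'language=%s' --data-urlencode 'format=%s' '%s/%s'\n" \
        "$text_url" "$language" "$format" "$BASE_URL" "$path"
}
```

Called with txt, the sample URL above, it, and tab, the helper prints a command that targets freeling_it/lrs and requests TAB output for the Italian sample text.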