Download ajax-solr from:
https://github.com/evolvingweb/ajax-solr
Open ${Nutch_runtime_home}/conf/schema.xml
Find: <field name="content" type="text" stored="false" indexed="true"/>
Change to: <field name="content" type="text" stored="true" indexed="true"/>
Check the following properties in nutch-default.xml:
<property>
  <name>fetcher.store.content</name>
  <value>true</value>
  <description>If true, fetcher will store content.</description>
</property>
<property>
  <name>parser.caching.forbidden.policy</name>
  <value>content</value>
  <description>If a site (or a page) requests through its robot metatags
  that it should not be shown as cached content, apply this policy. Currently
  three keywords are recognized: "none" ignores any "noarchive" directives.
  "content" doesn't show the content, but shows summaries (snippets).
  "all" doesn't show either content or summaries.</description>
</property>
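The check above can be scripted. Below is a minimal sketch, assuming the configuration text has already been read into a string; getNutchProperty and sampleConfig are hypothetical names for illustration and are not part of Nutch or ajax-solr. It uses a simple regular expression rather than a full XML parser, so it only suits the flat <property> layout shown above.

```javascript
// Hypothetical helper: returns the <value> of a named <property> in a
// Hadoop-style configuration string, or null when the property is absent.
function getNutchProperty(xml, name) {
  const propertyPattern = /<property>([\s\S]*?)<\/property>/g;
  let match;
  while ((match = propertyPattern.exec(xml)) !== null) {
    const body = match[1];
    const nameMatch = /<name>\s*([^<]*?)\s*<\/name>/.exec(body);
    if (nameMatch && nameMatch[1] === name) {
      const valueMatch = /<value>\s*([^<]*?)\s*<\/value>/.exec(body);
      return valueMatch ? valueMatch[1] : null;
    }
  }
  return null;
}

// Sample input mirroring the two properties listed above.
const sampleConfig = `
<configuration>
  <property>
    <name>fetcher.store.content</name>
    <value>true</value>
  </property>
  <property>
    <name>parser.caching.forbidden.policy</name>
    <value>content</value>
  </property>
</configuration>`;

console.log(getNutchProperty(sampleConfig, 'fetcher.store.content'));
console.log(getNutchProperty(sampleConfig, 'parser.caching.forbidden.policy'));
```

In practice the string would come from reading the real configuration file; the sample here only keeps the sketch self-contained.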
Open {Ajax-Solr}/examples/reuters/js/reuters.js
Change
solrUrl: 'http://reuters-demo.tree.ewdev.ca:9090/reuters/'
to
solrUrl: 'http://localhost:8983/solr/collection1/'
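Before going further it may help to confirm that the new solrUrl answers and that the content field really comes back as a stored field. The sketch below only builds a test URL for Solr's standard select handler; buildCheckUrl is a hypothetical helper, and the q, fl, rows and wt parameters are ordinary Solr query parameters. It assumes the core is named collection1, as in the step above.

```javascript
// Hypothetical helper: builds a select URL that asks Solr for a few
// documents and only the listed fields.
function buildCheckUrl(solrUrl, fields, rows) {
  // Tolerate a base URL with or without the trailing slash.
  const base = solrUrl.endsWith('/') ? solrUrl : solrUrl + '/';
  const params = new URLSearchParams({
    q: '*:*',              // match every document
    fl: fields.join(','),  // fields to return
    rows: String(rows),    // number of documents to return
    wt: 'json'             // JSON response
  });
  return base + 'select?' + params.toString();
}

console.log(
  buildCheckUrl('http://localhost:8983/solr/collection1/',
                ['title', 'url', 'content'], 1)
);
```

Opening the printed URL in a browser should return one document; if the content field is missing from it, the schema change has probably not taken effect or the pages were indexed before the change.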
Change
var fields = [ 'topics', 'organisations', 'exchanges' ];
to
var fields = [ 'title', 'url', 'content'];
Change
'facet.field': [ 'topics', 'organisations', 'exchanges', 'countryCodes' ],
to
'facet.field': [ 'title'],
Change
fields: [ 'topics', 'organisations', 'exchanges' ]
to
fields: [ 'title', 'url', 'content']
Change
id: 'text',
to
id: 'content',
Delete
'f.topics.facet.limit': 50,
'f.countryCodes.facet.limit': -1,
'facet.date': 'date',
'facet.date.start': '1987-02-26T00:00:00.000Z/DAY',
'facet.date.end': '1987-10-20T00:00:00.000Z/DAY+1DAY',
'facet.date.gap': '+1DAY',
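The parameter edits above amount to dropping the Reuters-specific keys and faceting on title only. The sketch below expresses that as a small function so the result can be inspected; adaptParamsForNutch and REUTERS_ONLY_KEYS are hypothetical names, and the facet: true entry is an assumption added only to show that unrelated keys are left alone.

```javascript
// Keys that the steps above delete from the Reuters demo parameters.
const REUTERS_ONLY_KEYS = [
  'f.topics.facet.limit',
  'f.countryCodes.facet.limit',
  'facet.date',
  'facet.date.start',
  'facet.date.end',
  'facet.date.gap'
];

// Hypothetical helper: returns a copy of params without the Reuters-only
// keys and with 'facet.field' reduced to the Nutch 'title' field.
function adaptParamsForNutch(params) {
  const adapted = {};
  for (const [key, value] of Object.entries(params)) {
    if (!REUTERS_ONLY_KEYS.includes(key)) {
      adapted[key] = value;
    }
  }
  adapted['facet.field'] = ['title'];
  return adapted;
}

const reutersParams = {
  facet: true, // assumption: stands in for any parameter that is kept
  'facet.field': ['topics', 'organisations', 'exchanges', 'countryCodes'],
  'f.topics.facet.limit': 50,
  'f.countryCodes.facet.limit': -1,
  'facet.date': 'date',
  'facet.date.start': '1987-02-26T00:00:00.000Z/DAY',
  'facet.date.end': '1987-10-20T00:00:00.000Z/DAY+1DAY',
  'facet.date.gap': '+1DAY'
};

console.log(adaptParamsForNutch(reutersParams));
```

The date facets have to go because they are tied to a field named date and to 1987 date bounds that belong to the Reuters data set.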
Open {Ajax-Solr}/examples/reuters/widgets/ResultWidget.js
Change
snippet += doc.dateline + ' ' + doc.text.substring(0, 300);
to
snippet += doc.content.substring(0, 300);
Change
snippet += '<span style="display:none;">' + doc.text.substring(300);
to
snippet += '<span style="display:none;">' + doc.content.substring(300);
Change
snippet += doc.dateline + ' ' + doc.text;
to
snippet += doc.content;
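The three replacements above assume every document has a content string. Pages that Nutch indexed without parsed text may lack the field, and any other reference to doc.text left in the file (for example in a length check around these lines) would need the same change. A defensive version of the snippet logic could look like the sketch below; buildSnippet is a hypothetical standalone function, not the widget's actual code, and it closes the hidden span itself.

```javascript
// Hypothetical helper: builds the result snippet from doc.content,
// showing the first `limit` characters and hiding the remainder.
function buildSnippet(doc, limit = 300) {
  // Assumption: content may be missing, or multi-valued (an array),
  // depending on how the field is declared in the schema.
  const raw = doc.content;
  const text = Array.isArray(raw) ? raw.join(' ') : (raw || '');
  if (text.length <= limit) {
    return text;
  }
  return text.substring(0, limit) +
    '<span style="display:none;">' + text.substring(limit) + '</span>';
}

console.log(buildSnippet({ content: 'Short page text.' }));
console.log(buildSnippet({}));
```

With this shape a document without content yields an empty snippet instead of a script error that stops the result list from rendering.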
Open {Ajax-Solr}/examples/reuters/index.html in a browser.