The Customize Windows

Technology Journal


By Abhishek Ghosh June 19, 2018 11:08 pm Updated on June 19, 2018

Configure Apache Tika With WordPress to Search, Get Meta of PDF/Doc Files


In our previously published article, How to Install Apache Tika on Ubuntu Server, we covered the basics of Apache Tika. Apache Tika can be combined with PHP. It detects content types and extracts metadata and text from different kinds of files; it can identify more than 1400 file types. Tika is related to the Apache Nutch codebase, and there is a Python library for Tika too. Tika can be implemented on a server in different ways to integrate with various blogging platforms and CMS (including WordPress). Here is how to configure Apache Tika with WordPress to search and get the metadata of PDF, Doc, Excel, text and other types of files. This is another example of integrating a Big Data tool with WordPress. A further example is improving the search function; we have an article on Apache Solr vs. Elasticsearch For WordPress Search. Apache Nutch and Apache Tika are in practice parts of search and crawl setups, and both can be combined with Apache Solr for other purposes. However, to use Apache Tika with WordPress we do not need to go through Apache Solr; we only want some functions within WordPress Admin.

How to Configure Apache Tika With WordPress

 

For new users, the difficult part is installing Apache Tika. With them in mind, we wrote the Apache Tika installation guide in some detail. As the first step, install Apache Tika on the same server where WordPress is running. Tika can of course be run on a separate server, but configuring such a setup may be difficult for a new user.
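For readers curious about the separate-server route mentioned above, here is a minimal sketch in Python. It assumes Tika is running in its server mode on the default port 9998 and that the `/meta` endpoint is reachable. The host name, the `TIKA_SERVER` constant and the function names are our own illustrations. The WordPress plugin discussed below does not work this way; it uses the jar file locally.

```python
import json
import urllib.request

# Assumption: a Tika server is listening on its default port 9998 on this host.
TIKA_SERVER = "http://localhost:9998"


def build_meta_request(data, server=TIKA_SERVER):
    """Prepare a PUT /meta request asking the Tika server for JSON metadata."""
    return urllib.request.Request(
        url=f"{server}/meta",
        data=data,
        method="PUT",
        headers={"Accept": "application/json"},
    )


def fetch_metadata(path, server=TIKA_SERVER):
    """Send one file to the Tika server and return its metadata as a dict."""
    with open(path, "rb") as handle:
        request = build_meta_request(handle.read(), server)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_metadata("sample.pdf")` only works while the Tika server is running; `sample.pdf` is a placeholder file name.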

Apart from installing Apache Tika, WordPress needs two plugins. The first is Search Everything:

https://wordpress.org/plugins/search-everything/

The second is another WordPress plugin, named Masala:

https://github.com/nanodust/masala

Masala means spice. Indian masalas are quite popular in America! Tikka means a small piece of meat, fish and so on. Together they make Tikka Masala, and the whole world knows what chicken tikka butter masala is. Apache projects are often named with Sanskrit or Buddhist words, reportedly to avoid copyright issues and to be playful. Apache Tika is Tikka's Tika: a delicious piece for Apache Solr.

Configure Apache Tika for the file types you need and check on the command line whether it can extract metadata. After that, install the plugin and read its source code. The plugin needs Tika's jar file to be present on your server and assumes that Java is installed on the server where WordPress is running. Place Apache Tika's jar file in the project's root folder and configure its path in the masala.php file. The plugin does not have much detailed documentation.
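The command line check described above can be sketched in Python as follows. This is only an illustration under assumptions: the jar location in `TIKA_JAR` and the function names are hypothetical, and the switches are those of the Tika app jar's command line (`--metadata`, `--text`, `--json`, `--detect`). It is not code from the Masala plugin.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical location; point this at wherever you placed the Tika app jar.
TIKA_JAR = Path("/opt/tika/tika-app.jar")


def tika_command(document, jar=TIKA_JAR, mode="--metadata"):
    """Build the command line for Tika's app jar.

    mode is one of the Tika CLI switches: --metadata, --text, --json, --detect.
    """
    return ["java", "-jar", str(jar), mode, str(document)]


def preflight(jar=TIKA_JAR):
    """Return a list of problems that would stop Tika from running."""
    problems = []
    if shutil.which("java") is None:
        problems.append("java was not found on PATH")
    if not Path(jar).is_file():
        problems.append(f"Tika jar not found at {jar}")
    return problems


def extract_metadata(document, jar=TIKA_JAR):
    """Run Tika on one file and return its raw 'key: value' metadata text."""
    result = subprocess.run(
        tika_command(document, jar), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Run `preflight()` first. If it returns an empty list, `extract_metadata("sample.pdf")` should print the same metadata you would see when running the jar by hand; `sample.pdf` is a placeholder file name.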

When you upload content such as a PDF or DOC file, the plugin processes the file after upload and inserts its metadata. You can then search the attachment's metadata, and the attachment will be listed in the search results.
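To make the idea concrete, here is a small Python sketch of the flow: parse the `key: value` metadata text, keep it next to the attachment, and match a search term against it. The function names and the sample metadata are made up for illustration. The plugin stores and searches the data inside WordPress, so this is not its actual code.

```python
def parse_tika_metadata(raw):
    """Turn Tika's 'key: value' lines into a dict, skipping malformed lines."""
    metadata = {}
    for line in raw.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            metadata[key.strip()] = value.strip()
    return metadata


def search_attachments(attachments, term):
    """Return names of attachments whose metadata values contain term."""
    needle = term.lower()
    return [
        name
        for name, metadata in attachments.items()
        if any(needle in value.lower() for value in metadata.values())
    ]


# Made-up sample in the shape of --metadata output, not real Tika output.
SAMPLE = "Content-Type: application/pdf\ndc:title: Annual Report\nxmpTPg:NPages: 12"
attachments = {"report.pdf": parse_tika_metadata(SAMPLE)}
print(search_attachments(attachments, "annual"))  # prints ['report.pdf']
```

The search is a plain case-insensitive substring match, which is enough to show why an attachment appears in results once its metadata is stored.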

If you are using Apache Solr for WordPress search, the metadata itself will be searchable, as in most search engines.


About This Article

Cite this article as: Abhishek Ghosh, "Configure Apache Tika With WordPress to Search, Get Meta of PDF/Doc Files," in The Customize Windows, June 19, 2018, January 15, 2021, https://thecustomizewindows.com/2018/06/configure-apache-tika-with-wordpress-to-search-get-meta-of-pdf-doc-files/.

Source:The Customize Windows, JiMA.in

 
