
Hadoop Adoption

6/12/2016

True to its iconic logo, Hadoop is still very much the elephant in the room. Many organizations have heard of it, yet relatively few can say they have a firm grasp of what the technology can do for their business, and even fewer have actually implemented it successfully.

Forrester Research predicted that Hadoop will become a cornerstone of the business technology agenda at most organizations. 

Scalability, affordability, and flexibility make Hadoop uniquely suited to change the big data scene. An open-source software framework, Hadoop allows for the processing of big data sets across clusters of commodity hardware, either on premises or in the cloud.
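
To make the "clusters of commodity hardware" point concrete, here is a minimal sketch of the classic word count job written against Hadoop's standard MapReduce Java API. The HDFS input and output paths are passed as arguments and are placeholders for illustration; in practice the class is packaged into a jar and submitted to the cluster, where the same mapper and reducer code runs in parallel on every node.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in each input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts for each word across all mappers.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    // args[0] = HDFS input path, args[1] = HDFS output path (must not exist yet).
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}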

At roughly one-thirtieth the cost of traditional data storage and processing, Hadoop makes it realistic and cost effective to analyze all data instead of just a data sample. Its open-source architecture enables data scientists and developers to build on top of it to form customized connectors or integrations.

Typically, data analysis requires some level of data preparation, such as data cleansing and eliminating errors, outside of traditional data warehouses. Once the data is prepared, it is transferred to a high-performance analytics tool, such as a Teradata data warehouse. With data stored in Hadoop, however, users can see "instant ROI" by moving the data workloads off of Teradata and running analytics right where the data resides.
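
One common way to run analytics right where the data resides is SQL-on-Hadoop through Hive, which the discussion above does not name but which illustrates the idea. Below is a hedged sketch using the HiveServer2 JDBC driver; the host name hadoop-master, the analyst user, and the web_clicks table are placeholder assumptions, not details of any particular deployment.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class QueryInPlace {
  public static void main(String[] args) throws Exception {
    // HiveServer2 JDBC driver; the query executes on the cluster,
    // next to the data, rather than after an export to a warehouse.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://hadoop-master:10000/default", "analyst", "");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "SELECT page, COUNT(*) AS hits FROM web_clicks GROUP BY page")) {
      while (rs.next()) {
        System.out.println(rs.getString("page") + "\t" + rs.getLong("hits"));
      }
    }
  }
}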

Another use of Hadoop is live archiving. Instead of backing up data and shipping it to a records storage vendor such as Iron Mountain, users can store everything in Hadoop and easily pull it up whenever necessary.
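
As a sketch of that live-archive pattern, the snippet below uses Hadoop's FileSystem API to copy a local file into HDFS and read it back on demand. The namenode address and file paths are assumptions made up for the example.

import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LiveArchive {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://hadoop-master:8020"); // assumed namenode address
    FileSystem fs = FileSystem.get(conf);

    // "Archive": copy the local file into HDFS instead of shipping it offsite.
    Path local = new Path("/var/exports/invoices-2016-06.csv");
    Path archived = new Path("/archive/finance/invoices-2016-06.csv");
    fs.copyFromLocalFile(local, archived);

    // "Pull it up whenever necessary": read it straight back out of HDFS.
    try (BufferedReader reader =
             new BufferedReader(new InputStreamReader(fs.open(archived)))) {
      System.out.println("First archived line: " + reader.readLine());
    }
  }
}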

The greatest power of Hadoop lies in its ability to house and process data that couldn't be analyzed in the past because of its volume and unstructured form. Hadoop can parse emails and other unstructured feedback to reveal insight that previously went untapped.

The sheer volume of data that businesses can store on Hadoop changes the level of analytics and insight that users can expect. Because it allows users to analyze all data and not just a segment or sample, the results can better anticipate customer engagement. Hadoop is moving beyond analytics that merely describe patterns in a sample and is now delivering full data set analytics that can predict future behavior.

There are, however, a few challenges.

Hadoop's ability to process massive amounts of data, for example, is both a blessing and a curse. Because it's designed to handle large data loads relatively quickly, the system runs in batch mode, meaning it processes huge data sets all at once rather than looking at smaller segments in real time. As a result, the system often forces users to choose between quantity and quality. At this point in Hadoop's life cycle, the focus is more on enormous data size than on high-performance analytics.

Because of the large size of the data sets fed into Hadoop, the number crunching doesn't take place in real time. This is problematic because the longer the gap between when data is collected and when a decision must be made on it, the less effective that data becomes.

The biggest problem of all is that Hadoop's seeming boundlessness tempts users into open-ended data exploration. Relying on Hadoop to deliver all the answers without asking the right questions is inefficient.

As companies begin to recognize Hadoop's potential, demand is increasing, and vendors are actively developing solutions that promise to painlessly transfer data onto Hadoop, improve its processing performance, and operationalize data to make it more business-ready.

Big data integration vendor Talend, for example, offers solutions that help organizations move their data onto Hadoop in high volume. The company works with more than 800 connectors that link to other data systems and warehouses to "pull data out, put it into Hadoop, and transform it into a shape that you can run analytics on."

While solutions such as those offered by Talend make the Hadoop migration more manageable for companies, vendors such as MapR tackle the batch-processing lag. MapR developed a solution that enhances the Hadoop data platform to make it behave like enterprise storage. It enables Hadoop to be accessed as easily as network-attached storage is accessed through the network file system; this means faster data management and system administration without having to move any data. 

Veteran data solution vendors such as Oracle are innovating as well, developing platforms that make Hadoop easier to use and to incorporate into existing data infrastructures. Oracle's latest updates revolve around allowing users to store and analyze structured and unstructured data together and giving them a set of tools to visualize data and find patterns or problems.

RapidMiner's approach to Hadoop has been to simplify it, eliminate the need for end users to code, and do for Hadoop analytics what WordPress did for website building. Once usable insights are collected, RapidMiner can connect the data platform to a marketing automation system or other digital experience management system to deploy campaigns or make changes based on data predictions.

Moving forward, analysts predict that leveraging Hadoop's potential will become a more attainable goal for companies. Because it's open-source, the possibilities are vast. Hadoop's ability to connect openly to other systems and solutions will increase adoption in the coming months and years.
