Channel: security – Cloudera Engineering Blog
Browsing all 166 articles

Inside Apache HBase’s New Support for MOBs

Learn about the design decisions behind HBase’s new support for MOBs. Apache HBase is a distributed, scalable, performant, consistent key-value database that can store a variety of binary data types....


Cloudera Navigator Encrypt Architecture: The Overview

Cloudera Navigator Encrypt is a key security feature in production-deployed enterprise data hubs. This post explains how it works. Cloudera Navigator Encrypt, which is integrated with Cloudera...


Thrift Client Authentication Support in Apache HBase 1.0

Thrift client authentication and doAs impersonation, introduced in HBase 1.0, provide more flexibility for your HBase installation. In the two-part blog series “How-to: Use the HBase Thrift Interface”...

How-to: Secure YARN Containers with Cloudera Navigator Encrypt

Learn how Cloudera Navigator Encrypt brings data security to YARN containers. With the introduction of transparent data encryption in HDFS, we are now a big step closer toward a secure platform in the...

What’s New in Cloudera Director 1.5?

Cloudera Director 1.5 is now available; this post describes what’s inside, including a new open source plugin interface. Cloudera Director is the manifestation of Cloudera’s commitment to providing a...


How-to: Run Apache Mesos on CDH

Big Industries, Cloudera systems integration and reseller partner for Belgium and Luxembourg, has developed an integration of Apache Mesos and CDH that can be deployed and managed through Cloudera...

Community Meetups at Strata + Hadoop World NYC 2015

Strata + Hadoop World 2015 NYC is more than a daytime conference; it’s also a nighttime meetup experience. (Plus, there are a bunch of book signings.) It won’t be long before we’re all in NYC for...

Meet Cloudera’s Apache Spark Committers

The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had the opportunity to ask Cloudera’s Apache Spark committers (Sean Owen,...


How-to: Prepare Unstructured Data in Impala for Analysis

Learn how to build an Impala table around data that comes from non-Impala, or even non-SQL, sources. As data pipelines start to include more aspects such as NoSQL or loosely specified schemas, you...


How-to: Index Scanned PDFs at Scale Using Fewer Than 50 Lines of Code

Learn how to use OCR tools, Apache Spark, and other Apache Hadoop components to process PDF images at scale. Optical character recognition (OCR) technologies have advanced significantly over the last...

Impala’s Next Step: Proposal to Join the Apache Software Foundation

The Impala project has already passed several important milestones on the way to its status as the leader and open standard for BI and SQL analytics on modern big data architecture. Today’s milestone...

New in Cloudera Enterprise 5.9: S3 Integration and SQL Editor Improvements

Cloudera Enterprise 5.9 includes the latest release of Hue (3.11), the web UI that makes Apache Hadoop easier to use. As part of Cloudera’s continuing investments in user experience and productivity,...

How to secure ‘Internet exposed’ Apache Hadoop

You may have heard of the recent (and ongoing) hacks targeting open source database solutions like MongoDB and Apache Hadoop. From what we know, an unknown number of hackers scanned for...


New in Cloudera Enterprise 5.10: Hue SQL Editor and Security Improvements

Cloudera Enterprise 5.10 includes the latest updates of Hue, the intelligent editor for SQL Developers and Analysts. As part of Cloudera’s continuing investments in user experience and productivity,...

Job Scheduling in Apache Hadoop

(guest blog post by Matei Zaharia) When Apache Hadoop started out, it was designed mainly for running large batch jobs such as web indexing and log mining. Users submitted jobs to a queue, and the...


Securing an Apache Hadoop Cluster Through a Gateway

(Added 6/4/2013) Please note the instructions below are deprecated. Please refer to the CDH4 Security Guide for up-to-date procedures. A few weeks ago we ran an Apache Hadoop hackathon. ApacheCon...

Configuration Parameters: What can you just ignore?

Configuring a Hadoop cluster is something akin to voodoo. There are a large number of variables in hadoop-default.xml that you can override in hadoop-site.xml. Some specify file paths on your system,...


High Energy Hadoop

We asked Brian Bockelman, a Post Doc Research Associate in the Computer Science & Engineering Department at the University of Nebraska–Lincoln, to tell us how Hadoop is being used to process the...

What’s New in Hadoop Core 0.20

Hadoop Core version 0.20.0 was released on April 22. In this post I will run through some of the larger or more significant user-facing changes since the first 0.19 release—there were 262 Jiras fixed...

Using Cloudera’s Hadoop AMIs to process EBS datasets on EC2

A while back, we noticed a blog post from Arun Jacob over at Evri (if you haven’t seen Evri before, it’s a pretty impressive take on search UI). We were particularly interested in helping Arun and...
