How-to: Deploy Apache Hadoop Clusters Like a Boss
Learn how to set up a Hadoop cluster in a way that maximizes successful production-ization of Hadoop and minimizes ongoing, long-term adjustments. Previously, we published some recommendations on...
New in CDH 5.3: Apache Sentry Integration with HDFS
Starting in CDH 5.3, Apache Sentry integration with HDFS saves admins a lot of work by centralizing access control permissions across components that utilize HDFS. It’s been more than a year and a half...
How-to: Do Real-time Big Data Discovery using Cloudera Enterprise and Qlik Sense
Thanks to Jesus Centeno of Qlik for the post below about using Impala alongside Qlik Sense. Cloudera and Qlik (which is part of the Impala Accelerator Program) have revolutionized the delivery of...
Couchdoop: Couchbase Meets Apache Hadoop
Thanks to Călin-Andrei Burloiu, Big Data Engineer at antivirus company Avira, and Radu Pastia, Senior Software Developer in the Big Data Team at Orange, for the guest post below about the Couchdoop...
How-to: Let Users Provision Apache Hadoop Clusters On-Demand
Providing Hadoop-as-a-Service to your internal users can be a major operational advantage. Cloudera Director (free to download and use) is designed for easy, on-demand provisioning of Apache Hadoop...
How Testing Supports Production-Ready Security in Cloudera Search
Security architecture is complex, but these testing strategies help Cloudera customers rely on production-ready results. Among other things, good security requires user authentication and that...
Converting Apache Avro Data to Parquet Format in Apache Hadoop
Thanks to Big Data Solutions Architect Matthieu Lieber for allowing us to republish the post below. A customer of mine wants to take advantage of both worlds: work with his existing Apache Avro data,...
How-to: Quickly Configure Kerberos for Your Apache Hadoop Cluster
Use the scripts and screenshots below to configure a Kerberized cluster in minutes. Kerberos is the foundation of securing your Apache Hadoop cluster. With Kerberos enabled, user authentication is...
Sneak Preview: HBaseCon 2015 Operations Track
This year’s HBaseCon Operations track features some of the world’s largest and most impressive operators. In this post, I’ll give you a window into the HBaseCon 2015’s (May 7 in San Francisco)...
Text Mining with Impala
Thanks to Torsten Kilias and Alexander Löser of the Beuth University of Applied Sciences in Berlin for the following guest post about their INDREX project and its integration with Impala for integrated...
How-to: Get Started with CDH on OpenStack with Sahara
The recent OpenStack Kilo release adds many features to the Sahara project, which provides a simple means of provisioning an Apache Hadoop (or Spark) cluster on top of OpenStack. This how-to, from...
Security, Hive-on-Spark, and Other Improvements in Apache Hive 1.2.0
Apache Hive 1.2.0, although not a major release, contains significant improvements. Recently, the Apache Hive community moved to a more frequent, incremental release schedule. So, a little while ago,...
Inside Apache HBase’s New Support for MOBs
Learn about the design decisions behind HBase’s new support for MOBs. Apache HBase is a distributed, scalable, performant, consistent key value database that can store a variety of binary data types....
Cloudera Navigator Encrypt Architecture: The Overview
Cloudera Navigator Encrypt is a key security feature in production-deployed enterprise data hubs. This post explains how it works. Cloudera Navigator Encrypt, which is integrated with Cloudera...
Thrift Client Authentication Support in Apache HBase 1.0
Thrift client authentication and doAs impersonation, introduced in HBase 1.0, provide more flexibility for your HBase installation. In the two-part blog series “How-to: Use the HBase Thrift Interface”...
How-to: Secure YARN Containers with Cloudera Navigator Encrypt
Learn how Cloudera Navigator Encrypt brings data security to YARN containers. With the introduction of transparent data encryption in HDFS, we are now a big step closer toward a secure platform in the...
What’s New in Cloudera Director 2.5?
Cloudera Director 2.5 brings cluster auto-repair functionality and improved support for AWS Spot instances. Support for Cloudera Manager’s external account feature has been added along with S3Guard...
Cloudera SDX: Under the Hood
What is SDX? Shared Data Experience — SDX — is Cloudera’s secret ingredient that makes it possible to deploy Cloudera’s four core functions (Data Engineering, Data Science, Analytic DB, Operational DB)...
What’s New in Cloudera Director 2.6
Cloudera Director 2.6 introduces support for protecting communications with TLS and SSH host keys. Azure support is enhanced with support for Azure Managed Disks and custom images. Cloudera Director...
Getting Started with Cloudera’s Cybersecurity Solution
A quick conversation with most Chief Information Security Officers (CISOs) reveals they understand they need to modernize their security architecture and the correct answer is to adopt a machine...