JAAS Big Data - What’s missing?

The question’s in the title. I have my own opinions, but I’m interested in getting more ideas.

Starter for 10

  • Authentication
  • Juju Storage Support in Hadoop charms

What else?

  • juju storage (you already said it, but it’s important enough to say it again)
  • juju network spaces (bdx raised an issue with lots more detail)
  • Figure out the top X knobs that people turn in hdfs-site.xml, yarn-site.xml and core-site.xml, and make sure bigtop and/or the charms expose them, e.g.:
    • probably need to at least expose some of the kerberos options from hdfs-site.xml to support your auth feature
    • bdx made a request for exposing mem/cpu config in yarn-site.xml

Config MGMT

  • Since there are many possible configs, a more generic custom-config charm option that lets a user add or override a set of configs that make it into yarn-site.xml/hdfs-site.xml/core-site.xml might be better than trying to match the configs 1:1.

  • +1 for custom configuration through the charms. This is the primary question users coming from Cloudera ask me about Juju BigData: “How do I configure the components?”

  • Cloudera exposes a subset of common configs for each component of the stack in the Cloudera Manager GUI. It would be cool if the charms allowed a user to modify any config using a custom-config option.
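
To make the custom-config idea a bit more concrete, here is a rough Python sketch of how a charm might merge user overrides into a rendered *-site.xml. This is not how the existing charms work; the "name=value" option format and the function names parse_custom_config and apply_overrides are made up for illustration. The two YARN properties are the standard memory and vcore settings from yarn-site.xml.

```python
import xml.etree.ElementTree as ET


def parse_custom_config(raw):
    """Parse 'name=value' lines (a hypothetical charm option format) into a dict."""
    overrides = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got: {line!r}")
        overrides[name.strip()] = value.strip()
    return overrides


def apply_overrides(site_xml, overrides):
    """Return site_xml with each override applied, adding properties that are missing."""
    root = ET.fromstring(site_xml)
    existing = {p.findtext("name"): p for p in root.findall("property")}
    for name, value in overrides.items():
        prop = existing.get(name)
        if prop is None:
            # Property not present yet: append a new <property> element.
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = name
        value_el = prop.find("value")
        if value_el is None:
            value_el = ET.SubElement(prop, "value")
        value_el.text = value
    return ET.tostring(root, encoding="unicode")


# Example input: a minimal yarn-site.xml and a user-supplied override string.
yarn_site = """<configuration>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>8192</value></property>
</configuration>"""

custom = """
yarn.nodemanager.resource.memory-mb=16384
yarn.nodemanager.resource.cpu-vcores=8
"""

merged = apply_overrides(yarn_site, parse_custom_config(custom))
print(merged)
```

With something like this, the charms could keep exposing the top X knobs as first-class options and let the generic option cover everything else, with the user-supplied value taking precedence when both are set.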

Storage

  • CephFS support
  • Juju storage bindings for HDFS

Network Spaces

  • This should come alongside the storage bits for obvious reasons (I don’t want data traveling over my mgmt interface DUH).

  • Cloudera requires a license and charges for support for its multi-network feature (even though it’s just a single configuration)

  • Easy selling point for Juju BigData over Cloudera once these are implemented (network support, decoupled storage support)