
Thread: How to best use a JBOD enclosure

  1. #1

    How to best use a JBOD enclosure

    One of the researchers at my organization was gifted a JBOD enclosure populated with sixty 10TB drives. He was also given two 480GB SSDs for use with the enclosure.

    To this point, all of my large-scale storage experience has been with SAN arrays or NAS enclosures.
    Since this enclosure does not have the intelligence of either of those devices, I'm trying to figure out the best way to utilize the enclosure.

    I've had the researcher purchase a server with a Xeon Gold 5122 CPU and 256GB of RAM to serve as the front end for the enclosure.

    The question is, what software to use?

    The SSDs were provided with the intent of using them as a cache for the enclosure.

    As far as I can tell, the only way to add caching to an enclosure that has no built-in mechanism of its own is to pair it with a software-defined storage product, like SUSE Enterprise Storage.

    The enclosure will only be used by a half dozen researchers to store large images and the output of the system that will be used to analyze those images.

    Given that the data will be 1-5 TB files, and that most of the activity will involve writing the files, as opposed to frequent file reads, I'm not sure that the cache is needed.

    In addition, the multiple servers needed for SDS (storage nodes and monitor nodes) seem like an expensive, overly complex solution for this particular situation.

    I've also considered FreeNAS, since it has the ability to utilize the SSDs for write caching, but I saw a fair amount of discussion over how dependable the ZFS file system is.
    The fact that the server has ECC memory addresses at least one concern with ZFS and its heavy reliance on in-memory (ARC) caching.
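
    If I went that route, the rough shape I have in mind would be something like this (just a sketch with placeholder device names; the vdev layout is only one possibility):

        # Sketch only - placeholder device names.
        # Ten 6-disk raidz2 vdevs (double parity per vdev); two shown here:
        zpool create tank \
            raidz2 da0 da1 da2 da3 da4 da5 \
            raidz2 da6 da7 da8 da9 da10 da11
        # ...repeat "zpool add tank raidz2 ..." for the remaining eight 6-disk groups.

        # The two SSDs as a mirrored SLOG; as I understand it, this only helps
        # synchronous writes, so it may not buy much for large streaming copies.
        zpool add tank log mirror ada0 ada1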

    My original intention was to simply set up the server as a SLES 11/12 server, use software RAID to manage the storage, and set it up as an SMB share.
    I wanted to make sure, however, that I'm not missing an alternative solution that might be a better fit for the hardware.
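
    In concrete terms, I'm imagining something along these lines (placeholder device, path, and share names; a rough sketch rather than a tested recipe):

        # Sketch only - placeholder names, untested.
        # One 6-drive software RAID 6 set (to be repeated for the other groups):
        mdadm --create /dev/md0 --level=6 --raid-devices=6 /dev/sd[b-g]

        # File system and mount point:
        mkfs.xfs /dev/md0
        mkdir -p /srv/images
        mount /dev/md0 /srv/images

        # Minimal Samba share in /etc/samba/smb.conf:
        #   [images]
        #       path = /srv/images
        #       read only = no
        #       valid users = @researchers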

    Any thoughts are welcome.
    Last edited by gathagan; 25-Apr-2018 at 20:40.

  2. #2

    Re: How to best use a JBOD enclosure

    gathagan wrote:

    >
    > One of the researchers at my organization was gifted a JBOD enclosure
    > populated with sixty 10TB drives. He was also given two 480GB SSD's
    > for use with the enclosure.


    Wow! 600TB of raw storage...

    > To this point, all of my large-scale storage experience has been with
    > SAN arrays or NAS enclosures.
    > Since this enclosure does not have the intelligence of either of those
    > devices, I'm trying to figure out the best way to utilize the
    > enclosure.
    >
    > I've had the researcher purchase a server with a Xeon Gold 5122 CPU
    > and 256GB of RAM to serve as the front end for the enclosure.


    That sounds like overkill just to interface with your storage, but it
    may be appropriate if the server will also do a significant amount of
    processing.


    > The question is, what software to use?
    >
    > The SSD's were provided with the intent of using them as a cache for
    > the enclosure.
    >
    > As far as I can tell, the only way to provide the caching mechanism
    > with an enclosure that does not have a built-in mechanism seems to be
    > using the enclosure with a software-defined storage product, like SUSE
    > Enterprise storage


    SUSE Enterprise Storage (or Ceph) would not be appropriate. These
    solutions rely on multiple systems providing both storage and
    processing capabilities.


    > The enclosure will only be used by a half dozen researchers to store
    > large images and the output of the system that will be used to analyze
    > those images.
    >
    > Given that the data will be 1-5 TB files, and that most of the
    > activity will involve writing the files, as opposed to frequent file
    > reads, I'm not sure that the cache is needed.


    If that is indeed the case, and if the two SSDs are the only available
    cache, then I suspect the SSDs would be of limited value. However, the
    disk/RAID controller in the server may provide some caching options.

    > In addition, the multiple servers needed to provide SDS with storage
    > nodes and monitor nodes seems to be an expensive, overly complex
    > solution for this particular situation.


    Agreed!

    > I've also considered FreeNAS, since it has the ability to utilize the
    > SSDs for write caching, but I saw a fair amount of discussion over how
    > dependable the ZFS file system is.
    > The fact that the server has ECC memory addresses at least one concern
    > with ZFS and its heavy reliance on in-memory (ARC) caching.
    >
    > My original intention was to simply set up the server as a SLES 11/12
    > server, use software RAID to administrate the storage and set it up as
    > an SMB share.
    > I wanted to make sure, however, that I'm not missing an alternative
    > solution that might be a better fit for the hardware.
    >
    > Any thoughts are welcome.


    A bit more information would be helpful:
    - Do you have the make/model of the enclosure?
    - How does it interface to the server? Are there multiple interfaces?
    - What kind of drives do you have: SAS/SATA? Speed? Model? etc...

    Assuming you want redundant storage, RAID-5 would not be an appropriate
    solution because of the size of the array and because of the "write"
    overhead. More info is needed to determine a more appropriate solution.

    --
    Kevin Boyle - Knowledge Partner
    If you find this post helpful and are logged into the web interface,
    please show your appreciation and click on the star below this post.
    Thank you.

  3. #3

    Re: How to best use a JBOD enclosure

    kboyle wrote:

    > Wow! 600TB of raw storage...
    >
    > That sounds like overkill just to interface with your storage, but it
    > may be appropriate if the server will also do a significant amount of
    > processing.
    >
    > SUSE Enterprise Storage (or Ceph) would not be appropriate. These
    > solutions rely on multiple systems providing both storage and
    > processing capabilities.
    >
    > If that is indeed the case, and if the two SSDs are the only available
    > cache, then I suspect the SSDs would be of limited value. However, the
    > disk/RAID controller in the server may provide some caching options.
    >
    > Agreed!
    >
    > A bit more information would be helpful:
    > - Do you have the make/model of the enclosure?
    > - How does it interface to the server? Are there multiple interfaces?
    > - What kind of drives do you have: SAS/SATA? Speed? Model? etc...
    >
    > Assuming you want redundant storage, RAID-5 would not be an appropriate
    > solution because of the size of the array and because of the "write"
    > overhead. More info is needed to determine a more appropriate solution.



    Thanks for the quick reply, Kevin

    The enclosure is an HGST 4U60.
    HGST markets it as a "Storage Platform".
    It's at the low end of the first generation product line, so there's just one IO module with one In/Out pair of Mini-SAS ports.
    The drives are their Ultrastar HE10 SATA drives (6Gb/s, 7,200 rpm, 256MB cache).

    I purposely over-spec'd the server to avoid running into limitations if it is re-purposed somewhere down the line.
    Most of the controllers that HGST qualifies for use with the enclosure are not RAID controllers, and the few that are RAID controllers are only qualified for one or two operating systems.
    My intention is to handle the configuration via software, so the controller is a non-RAID SAS controller (LSI 9300-8e) without any caching mechanism.


    The researcher will be using an NVIDIA DGX-1 to run analysis on large image files.
    Those image files, and the data sets produced by the analysis, will be stored on this enclosure.
    As such, it's my understanding that the major desire for the enclosure is capacity with as much resilience as feasible, given the hardware limitations.

    The capacity loss that RAID 10 entails makes it unattractive, since speed is not the top priority.

    My initial thought was to create a large RAID 60 container made up of ten RAID 6 sub-arrays, each comprising six drives.
    If the software allowed for using the SSDs as a write cache, I would put them in the server instead of the enclosure.
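
    In mdadm terms, I picture it roughly like this (placeholder device names; just a sketch to show the shape of it):

        # Sketch only - placeholder device names.
        # Ten 6-drive RAID 6 sub-arrays:
        mdadm --create /dev/md0 --level=6 --raid-devices=6 /dev/sd[b-g]
        mdadm --create /dev/md1 --level=6 --raid-devices=6 /dev/sd[h-m]
        # ...and so on through /dev/md9 for the remaining drives.

        # Stripe the ten RAID 6 sets together (RAID 60), once md0-md9 all exist:
        mdadm --create /dev/md10 --level=0 --raid-devices=10 /dev/md[0-9]

        # Using the SSDs as a write cache on plain Linux would mean layering
        # something like bcache or lvmcache/dm-cache on top, which I'd still
        # need to research.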

    At this point, the researcher is still in the process of acquiring the DGX-1, so I have time to change directions if it makes more sense to handle the storage with a RAID controller or to take a different approach.

  4. #4

    Re: How to best use a JBOD enclosure

    gathagan wrote:

    > Thanks for the quick reply, Kevin
    >
    > The enclosure is an HGST 4U60.
    > HGST markets it as a "Storage Platform".
    > It's at the low end of the first generation product line, so there's
    > just one IO module with one In/Out pair of Mini-SAS ports.


    Well, it is what it is. I've looked at the specs and better understand
    what you're up against. ;-)

    > The drives are their Ultrastar HE10 SATA drives (6Gb/s, 7,200 rpm,
    > 256MB cache).


    That's SATA 6Gb/s with an Error rate (non-recoverable, bits read) of 1
    in 10**15.


    > I purposely over-spec'd the server to avoid running into limitations
    > if it is re-purposed somewhere down the line.
    > Most of the controllers that HGST qualifies for use with the
    > enclosure are not RAID controllers, and the few that are RAID
    > controllers are only qualified on one or two OS's.
    > My intention is to handle the configuration via software, so the
    > controller is a non-RAID SAS controller (LSI 9300-8e) without any
    > caching mechanism.


    Now that I know a bit more about the HGST 4U60, that decision makes
    sense.


    > The researcher will be using an NVIDIA DGX-1 to run analysis on large
    > image files.
    > Those image files, and the data sets produced by the analysis will be
    > stored on this enclosure.
    > As such, it's my understanding that the major desire for the enclosure
    > is capacity with as much resilience as feasible, given the hardware
    > limitations.


    Again, that appears to be a very reasonable approach; however, you may
    need to give some additional consideration to the phrase "with as much
    resilience as feasible".


    > The capacity loss that RAID 10 entails makes it unattractive, since
    > speed is not the top priority.


    Yes, the capacity overhead is RAID 10's least attractive "feature".
    Write performance is certainly much better than with parity RAID, but
    the big advantage RAID 10 offers is the recovery time when a drive
    fails. The use of SATA drives doesn't help matters.


    > My initial thought was to create a large RAID 60 container made up of
    > ten RAID 6 sub-arrays, each comprising six drives.
    > If the software allowed for using the SSDs as a write cache, I would
    > put them in the server instead of the enclosure.


    Again, this appears to be a reasonable compromise: overhead is "only"
    33 percent, but how long is your array rebuild time when you lose a
    drive? To rebuild, you would need to read 4 x 10 TB and write 10 TB.
    That's 50 TB, and with the typical sustained transfer rate of 225 MB/s
    given in the drive specs, that works out to roughly 61 hours. Of
    course, we all know it will take a lot longer than that, and it's
    unlikely your server will be able to do much useful work while the
    array is rebuilding.
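
    For what it's worth, here is the back-of-the-envelope arithmetic,
    together with the read-error exposure implied by the drive spec I
    quoted above:

        # 4 x 10 TB read + 1 x 10 TB written = 50 TB at 225 MB/s sustained:
        echo "scale=1; (50 * 10^12) / (225 * 10^6) / 3600" | bc
        # -> 61.7 hours, best case, with no other load on the array

        # Chance of hitting an unrecoverable read error along the way:
        # 40 TB read is 3.2 x 10^14 bits against the 1-in-10^15 spec.
        echo "scale=2; (40 * 10^12 * 8) / 10^15" | bc
        # -> 0.32 expected errors per rebuild, which is not negligible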

    So, as long as you are aware and this is acceptable, I think you may
    have part of your solution.

    >
    > At this point, the researcher is still in the process of acquiring the
    > DGX-1, so I have time to change directions if it makes more sense to
    > handle the storage with a RAID controller or to take a different
    > approach.


    The next thing to consider is how the storage will be used. I assume
    you are planning to use software RAID? I like software RAID, but how
    are you going to consolidate 10 x 40 TB RAID pools? LVM is one option,
    but there are (or were, in earlier releases) restrictions when mixing
    LVM with software RAID. If this is what you are considering, you
    should verify that what you are planning is workable.
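
    Just to illustrate the LVM option (names are made up, and only after
    you have verified that it plays nicely with md RAID):

        # Illustration only - md0..md9 would be the ten software RAID 6 sets.
        pvcreate /dev/md0 /dev/md1 /dev/md2 /dev/md3 /dev/md4 \
                 /dev/md5 /dev/md6 /dev/md7 /dev/md8 /dev/md9
        vgcreate research_vg /dev/md[0-9]
        # One large logical volume (roughly 400 TB usable) across all ten arrays:
        lvcreate -l 100%FREE -n research_lv research_vg
        mkfs.xfs /dev/research_vg/research_lv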

    So these are some of my thoughts. Anything new here?

    --
    Kevin Boyle - Knowledge Partner
    If you find this post helpful and are logged into the web interface,
    please show your appreciation and click on the star below this post.
    Thank you.

  5. #5

    Re: How to best use a JBOD enclosure

    gathagan wrote:

    > I've also considered FreeNAS, since it has the ability to utilize the
    > SSDs for write caching, but I saw a fair amount of discussion over how
    > dependable the ZFS file system is.


    Your other posts suggest you will use your server to manage your
    storage *and* process your data. With such a large volume of data and
    the need to have your server manage several separate arrays, perhaps it
    is safer to dedicate a server to managing storage?

    FreeNAS is one option but there are many others. It may be worthwhile
    taking a look at some of them and see if there is a fit between what
    they offer, what you need, and the HGST 4U60 you have available.

    Here's a fairly recent list of some. I'm sure you can find others.
    OPEN SOURCE STORAGE SOLUTIONS
    https://thesanguy.com/2017/07/06/ope...age-solutions/

    I recognise quite a few names and know a bit about some of them but I
    haven't looked at them in detail to determine which ones might be
    appropriate candidates. I thought you might enjoy that...

    --
    Kevin Boyle - Knowledge Partner
    If you find this post helpful and are logged into the web interface,
    please show your appreciation and click on the star below this post.
    Thank you.
