StrongLink iRODS Comparison

Table of contents

  1. High-Level iRODS Comparison
  2. Simplicity
  3. Metadata flexibility
  4. StrongLink Was Designed to Be Simple
  5. iRODS Scalability Challenges
  6. StrongLink Is Designed to Support Extreme Scalability
  7. Contact and further information

High-Level iRODS Comparison

On the surface, it may appear that iRODS is comparable to StrongLink as a data management platform. Phrases like Data Virtualization, Data Discovery, Workflow Automation, and Secure Collaboration are used by both. However, there are fundamental differences between the architectures of the two systems.

Outlined below are some of the key differences between iRODS and StrongLink. This is not an exhaustive list; it reflects our opinions, based upon publicly available information and feedback from customers who have used iRODS or considered implementing it in their environments.

Simplicity

  • iRODS is not a shrink-wrapped product. It is a collection of components that must be assembled by highly skilled technical staff in the user's environment.
  • iRODS does not include a database. Customers must provide their own (only relational databases are supported: Oracle, MySQL, PostgreSQL).
    • This means customers must have trained database administrators (DBAs) on staff to manage configuration and any changes to its single, monolithic metadata schema.
    • What about NoSQL databases? Although the iRODS Consortium has talked for several years about addressing this problem, there are no current plans to adopt schema-less databases such as NoSQL offerings.

Metadata flexibility

  • iRODS claims it has a “Metadata Catalog” which describes every object and storage resource in the system.
  • The key word here is “Catalog”. This is akin to a structured table of contents that is established when the system is set up.
    • The problem is that this is a monolithic relational database schema that must be managed by one or more DBAs, and which cannot be altered without taking down the entire system.
    • There is no way for average users to dynamically add metadata schemas of their own to facilitate workflows, discovery, etc.
  • All set-up is done on the command line. Any change to the environment interrupts user access and requires significant System Administrator or DBA involvement.
  • Any iRODS installation is a “roll-your-own” operation. All hardware must be sized, sourced, and integrated by the end-user’s technical staff.
  • Although the iRODS Consortium talks about collaboration, there is no concept of a Global Namespace.
    • Every iRODS zone has its own namespace and is its own administrative entity.
    • If you have three separate “silos” of storage, you still have three separate namespaces to manage; one for each storage system.
  • iRODS was built out of a code base from 1997 called SRB (Storage Resource Broker). It was built before modern distributed systems or schema-less databases were even conceived.
  • SDS customers who have tried to implement iRODS report that the degree of technical integration and ongoing maintenance required can drive costs higher than expected.

StrongLink Was Designed to Be Simple

  • StrongLink comes with everything needed, fully integrated. This includes all databases, configuration, and also a single-pane-of-glass management interface for all storage, namespaces, data, and metadata.
  • StrongLink is designed to be used by any non-technical user, and managed without trained database professionals. Non-technical users can add their own metadata types, perform complex multi-level queries, etc. without IT intervention or the need to learn a query language.
  • StrongLink has a self-healing, no-single-point-of-failure architecture. There is no single master head node, as there is with iRODS. All functions are distributed across all nodes, such that any node in the constellation can fail and there is no impact to users or operations.
  • StrongLink includes an AI (artificial intelligence) engine that can detect usage patterns and suggest policy revisions based upon real activity across the system. This capability increases the ease of use for administrators, particularly in extremely large or distributed environments.
  • Built-in reporting capabilities automatically generate utilization and other reports across any or all storage types. These may be customized by system administrators.

iRODS Scalability Challenges

  • Architecture:
    • In a few key iRODS installations that describe extreme-scale data footprints, the end-user organization has done a significant amount of custom development to overcome the scaling limitations inherent in the iRODS architecture.
    • The iRODS Consortium suggests a couple of ways to address database scalability, including external load balancing or other workarounds for the problem of a central head node.
    • Each of these involves more database customization and significant customer development effort, and does not account for the additional hardware and integration efforts needed to scale out.
    • Queries:
      • Relational databases are good at structuring a fixed schema of recurring metadata. But unstructured, multi-argument queries across multiple metadata types and millions, or especially billions, of files will bring any relational database to its knees. It is precisely these types of queries that are needed to support a wide range of unstructured data.
        • iRODS queries are complex to build and require deep database expertise.
        • The time to result is significant for complex queries in iRODS, because of the multiple table look-ups and joins that are required.
      • iRODS only supports a single relational database, which creates not only a performance bottleneck but also a major single point of failure.
      • The iRODS Consortium does recommend some techniques to replicate the database, relying on multiple instances and load balancers. But that results in significantly more complexity and requires additional technical expertise to implement, not to mention the added cost.
      • iRODS does not track or copy metadata when files are moved between zones.
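
To illustrate the point about multiple table look-ups and joins, here is a minimal sketch in Python using the standard sqlite3 module. The objects and metadata tables, the attribute names, and the find_objects and build_demo helpers are all hypothetical and do not reflect the actual iRODS catalog layout. The sketch only shows that, in a generic relational attribute-value design, every additional search criterion adds another join to the query.

```python
import sqlite3


def build_demo():
    """Create a small in-memory database with a hypothetical layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE objects (id INTEGER PRIMARY KEY, path TEXT);
        CREATE TABLE metadata (object_id INTEGER, attr TEXT, value TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO objects VALUES (?, ?)",
        [(1, "/a/scan1.tif"), (2, "/a/scan2.tif"), (3, "/b/notes.txt")],
    )
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?)",
        [
            (1, "instrument", "microscope-A"),
            (1, "project", "alpha"),
            (2, "instrument", "microscope-A"),
            (2, "project", "beta"),
            (3, "project", "alpha"),
        ],
    )
    return conn


def find_objects(conn, criteria):
    """Return ids of objects that match every (attribute, value) pair.

    One extra join on the metadata table is generated per criterion,
    which is why multi-argument queries grow expensive in this design.
    """
    joins, params = [], []
    for i, (attr, value) in enumerate(criteria.items()):
        joins.append(
            f"JOIN metadata m{i} ON m{i}.object_id = o.id "
            f"AND m{i}.attr = ? AND m{i}.value = ?"
        )
        params += [attr, value]
    sql = "SELECT o.id FROM objects o " + " ".join(joins) + " ORDER BY o.id"
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = build_demo()
    # Two criteria produce a query with two joins on the metadata table.
    print(find_objects(conn, {"instrument": "microscope-A", "project": "alpha"}))
```

With two criteria the generated SQL contains two joins; a query across ten metadata types would contain ten, each evaluated against a table that holds one row per metadata fragment per file.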

StrongLink Is Designed to Support Extreme Scalability

  • StrongLink includes multiple databases, each tuned for specific functions. This enables back-end system functions to operate independently of the metadata database, to ensure maximum performance and the best user experience.
  • As noted above, the StrongLink system is architected as a distributed system with no central head node or single point of failure. New nodes can be added without interrupting operations. Software upgrades can roll across the nodes one by one in sequence, so users never experience downtime even as the system is upgraded.
  • The StrongLink Metadata database:
    • All metadata of any kind is aggregated into a powerful schema-less NoSQL database. There is no limit to the number of metadata types that can be aggregated into the system. Unlike with iRODS, the StrongLink metadata database can be modified or extended, and can import new schemas, without taking the system offline or interrupting user access.
    • Metadata types include all file system metadata (access time, modify time, name, etc.), but also other types of rich metadata that might be included in the file header. This is of particular importance for imaging files that contain key information in their headers.
    • In addition, the metadata system can import metadata that might be trapped in external sources: this may include external files such as a CSV index, output from another database, and so on.
    • These otherwise incompatible metadata types are automatically aggregated in StrongLink, without the need for a DBA, so that any metadata fragment from any source can trigger a workflow or automatic data migration, and be the basis for a query.
    • To further support extreme scalability, the StrongLink metadata engine is sharded, or distributed in small chunks, across all available nodes, and each shard is further replicated up to three times for a high level of resilience. This is a well-proven technique used by modern database technologies.
      • This parallelization of the system means StrongLink can scale out linearly to accommodate any size environment, and handle any I/O requirements.
      • The system can sustain multiple node failures without data loss or user interruption because of this powerful resilience feature.
  • Data protection, provenance, versioning.
    StrongLink also assures data integrity by automatically running at least two checksums against each file. Dual checksums using the robust SHA-512 (SHA-2) and MD5 standards guard against hash collisions, which could otherwise result in accidental deletions or other problems. While hash collisions are rare in smaller environments, in environments that scale to tens or hundreds of billions of objects, the risk is much higher. With StrongLink, metadata is globally searchable, and can be used for complex queries and for intelligent, policy-based data movement and management.
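
As an illustration of the dual-checksum idea, here is a minimal sketch using Python's standard hashlib module. The dual_checksum and same_content helpers and the chunk size are assumptions made for this example; it is not StrongLink's implementation. Two files are treated as identical only if both their SHA-512 and MD5 digests agree.

```python
import hashlib


def dual_checksum(path, chunk_size=1024 * 1024):
    """Return (sha512_hex, md5_hex) for a file, reading it only once.

    Hypothetical helper: the file is streamed in chunks so that large
    files do not have to fit in memory. Note that hashlib.md5 can be
    disabled on some FIPS-restricted Python builds.
    """
    sha512 = hashlib.sha512()
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha512.update(chunk)
            md5.update(chunk)
    return sha512.hexdigest(), md5.hexdigest()


def same_content(path_a, path_b):
    """Treat two files as identical only if BOTH digests match."""
    return dual_checksum(path_a) == dual_checksum(path_b)
```

Requiring both digests to match means that a collision in one hash function alone is not enough for two different files to be mistaken for each other.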

Contact and further information

Here you can find more information about StrongLink. Use our “StrongLink request” form to request more information from us. We are happy to answer your questions.

Complexity is the enemy of storage ROI
StrongLink Cloud Gateway
StrongLink Data mobility and tiering
StrongLink Datasheet
StrongLink GDPR Solution Brief
StrongLink Infographic Cognitive Data Management
StrongLink Infographic Storage Management
StrongLink iRODS Comparison
StrongLink Isilon Solution Brief
StrongLink LTFS FAQ
StrongLink request
Why StrongLink?