Storage and Alfresco
1. Storage Foundation and
Alfresco
Toni de la Fuente
Principal Solutions Engineer, Americas
toni.delafuente@alfresco.com
Blog: blyx.com – Twitter: @ToniBlyx
2. Agenda
• Intro to Storage Concepts
• Hardware
• Alfresco Storage Related Solutions
– Alfresco S3
• Caching contentstore
– Alfresco XAM
– Content Store Selector
– Replication / Geo-clusters / Redundancy
• Partners Solutions
– Alf2CAS, Star Storage
• Storage Best Practices with Alfresco
• Backup and Recovery
3. Intro to Storage Concepts: stack
File Protocol: NFS, CIFS, SMB
File System: Ext3, Ext4, ReiserFS, XFS, GFS, NTFS, FAT32, GlusterFS, OCFS, ZFS
Block Management: MDM, LVM (Logical Volume Management)
Block Protocol: SCSI, SATA, FC
RAID (HW or SW): Mirrors, Stripes
Hardware: Disks, connectors, racks, FC switches
4. Intro to Storage Concepts
• Hard drive types and interfaces
– PATA: Parallel Advanced Technology Attachment
• AKA IDE or EIDE, older, 40-pin connector, less efficient, used
to run at 4K – 5K rpm.
– SATA: Serial ATA
• Similar to PATA, different connector, more energy efficient,
between 5K and 10K rpm.
– SCSI: Small Computer System Interface
• Spin at 10K and 15K rpm, need a controller
– SSD: Solid State Drives
• No mechanical parts, semiconductor based, much faster than
mechanical drives and less likely to break down.
5. Intro to Storage Concepts
• Hard drive types and interfaces
– FC: Fibre Channel
• Successor to parallel SCSI, broader usage than mere disk
interfaces, used for SANs.
– SAS: Serial Attached SCSI
• Similar to SCSI but serial rather than parallel.
– Other interfaces end user oriented:
• USB
• Firewire
• Thunderbolt
• CAS: content-addressable storage, a mechanism for storing
information that can be retrieved based on its content, not its
storage location (EMC Centera / Caringo).
• XAM: standard interface for archiving in CAS.
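The idea behind CAS can be illustrated with a minimal in-memory sketch. It assumes SHA-256 as the addressing hash; the class and method names are hypothetical and are not the Centera, Caringo or XAM API.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS: the address of a blob is the SHA-256 of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Identical content always maps to the same address,
        # so a duplicate is stored only once.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hashing on read detects content that no longer matches its address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data

    def __len__(self):
        return len(self._blobs)
```

The caller keeps the returned address instead of a path, which is why content in such a store is retrieved by what it is rather than by where it lives.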
6. Intro to Storage Concepts
• RAID types (SW or HW)
[Figure: RAID levels comparison, with the annotation "← Faster with parity"]
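As a reminder of what parity means in parity-based RAID levels such as RAID 5: the parity block is the XOR of the data blocks in a stripe, so any single lost block can be rebuilt from the survivors. A minimal sketch, with hypothetical function names and tiny byte strings standing in for disk blocks:

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def parity(data_blocks):
    """Parity block for one stripe of data blocks."""
    return xor_blocks(data_blocks)


def rebuild(surviving_blocks, parity_block):
    """Recover the single missing data block of a stripe."""
    return xor_blocks(list(surviving_blocks) + [parity_block])
```

For example, with a stripe of three blocks, `rebuild([d0, d2], parity([d0, d1, d2]))` returns `d1`.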
7. Intro to Storage Concepts
Main differences between SAN and NAS
A SAN is a shared "network" of storage
• Block access to LUNs
• Online and offline storage
• SAN device = storage array
• Zoning: data integrity and security
• Dedicated fiber network
Protocols:
• SCSI over Fibre Channel
• SCSI over IP/Ethernet (iSCSI) and FC, InfiniBand
NAS is a file system shared over a network
• File access to data
• Online storage only
• NAS device = file server or "filer", already formatted
Protocols:
• NFS, CIFS over IP over Ethernet
8. Intro to Storage Concepts
Who needs a SAN?
• Database servers and ECM: Oracle, SQL Server, DB2 and
other database servers.
• File servers: Using SAN-based storage for file servers lets
you expand file server resources quickly, makes them run
better, and enables you to manage your file-based NAS
storage through the SAN.
• Backup servers: SAN-based backup is dramatically faster
than LAN-based backup.
• Voice/video servers: Manage large amounts of data very
quickly.
• High-performance application servers: Applications such
as document management, customer relationship
management, billing, data warehouses, and other
high-performance and critical applications all benefit from
what a SAN can provide.
11. Alfresco Storage Related Solutions
Alfresco S3 Connector
• An alternative contentstore implementation that uses S3 directly (S3
APIs)
• Somewhat equivalent to XAM, but not identical
– Unlike XAM, S3 doesn’t offer retention policies
• Enterprise only
– USD10K for Alfresco Standard
– USD13.4K for Alfresco Enterprise
• Shipped as a single repo-side AMP
• Can only be installed into a new Alfresco instance (no migration!)
• Configuration must be done before first start.
• Can also configure caching content store (default cache size: 50GB)
• Only supported if Alfresco is running on Amazon EC2
• Amazon EBS still required for database files, indexes, etc.
• Does not support S3 Encryption yet.
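The caching content store mentioned above sits in front of a slow store (S3 here) and keeps recently read content locally, up to a size limit. A minimal sketch of that idea, assuming a least-recently-used eviction policy; the class is hypothetical and is not Alfresco's implementation, which is configured through the repository rather than written by hand.

```python
from collections import OrderedDict


class CachingContentStore:
    """Size-bounded read cache in front of a slower backing store."""

    def __init__(self, backing_store, max_bytes):
        self.backing_store = backing_store  # assumption: exposes get(key) -> bytes
        self.max_bytes = max_bytes          # cache size limit in bytes
        self._cache = OrderedDict()         # key -> bytes, oldest first
        self._used = 0
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)    # mark as most recently used
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        data = self.backing_store.get(key)  # slow path, e.g. a remote call
        self._add(key, data)
        return data

    def _add(self, key, data):
        if len(data) > self.max_bytes:
            return                          # too large to cache at all
        self._cache[key] = data
        self._used += len(data)
        while self._used > self.max_bytes:  # evict least recently used entries
            _, evicted = self._cache.popitem(last=False)
            self._used -= len(evicted)
```

A plain dict works as the backing store for experiments, since it also exposes `get(key)`.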
12. Alfresco Storage Related Solutions
Alfresco XAM Connector (deprecated)
• Provides access from Alfresco to XAM-enabled
storage devices.
• New XAM connector available
• Only EMC Centera supported
• Released with 3.4, Jan 2011.
• Enterprise only
• Still being supported for existing customers
– until November 30th 2014 or their current subscription
runs out, whichever comes first.
13. Alfresco Storage Related Solutions
Content Store Selector
• Storage policies based on
business rules
• Since Alfresco 3.2
• Examples
o By type: Large video files on
fast expensive drives. Office
documents on slower, more
cost effective, drives.
o By business unit, by age, by
usage, by ...
• Leverage Rules and Actions
to drive
[Figure: policy rules route content to three storage tiers: SSD ($$$), FC drives ($$), SATA drive ($). SSD = Solid State Drives, FC = Fibre Channel]
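The policy idea on this slide can be sketched as an ordered list of rules, where the first matching rule decides the store. Everything here is illustrative: the store names, the `Document` fields and the one-year threshold are assumptions, and in Alfresco the selection is driven by rules and actions on the content rather than by code like this.

```python
from dataclasses import dataclass


@dataclass
class Document:
    name: str
    mime_type: str
    size_bytes: int
    age_days: int


# Hypothetical store names, one per tier in the figure.
DEFAULT_STORE = "fcStore"

# Ordered rules: (predicate, target store). First match wins.
RULES = [
    (lambda d: d.mime_type.startswith("video/"), "ssdStore"),  # by type
    (lambda d: d.age_days > 365, "sataStore"),                 # by age
]


def select_store(doc, rules=RULES, default=DEFAULT_STORE):
    """Return the name of the store a document should live in."""
    for predicate, store in rules:
        if predicate(doc):
            return store
    return default
```

Rules by business unit or by usage would be added to the list in the same way.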
14. Alfresco Storage Related Solutions
Content Replication (Alfresco on-premise to Alfresco on-
premise)
• Distributed repository replication
– Selective replication of spaces and content
– Support for full, incremental and delete
– One source – multiple destinations
– Replicas are read-only (update at source only, redirect
if needed)
• Benefits
– Support geographically dispersed companies
– Provide fast local access
– Remove single point of failure
– Reduce wide area network traffic
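The difference between full, incremental and delete replication can be shown with a small planning function. This is a sketch only, assuming each side is described by a map of path to content version or hash; it is not how the Alfresco replication service is implemented.

```python
def plan_replication(source, target):
    """Work out what a one-way replication run has to do.

    source and target map path -> content version (or hash).
    Returns (paths to copy to the target, paths to delete on the target).
    """
    # New or changed at the source: must be copied.
    to_copy = sorted(p for p, v in source.items() if target.get(p) != v)
    # Gone from the source: must be removed from the read-only replica.
    to_delete = sorted(p for p in target if p not in source)
    return to_copy, to_delete
```

An empty target gives a full replication; a populated target gives an incremental run plus deletes.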
15. Alfresco Storage Related Solutions
Content Replication / Geo-clusters / Redundancy
• Alfresco Cloud Sync: On premise ↔ Cloud
– Content oriented, not for storage replication
• Synchronization feature between Alfresco
on-premises installations (not available yet).
• Alfresco Desktop Sync: from Windows or Mac
desktop to Alfresco on-premise (not available
yet)
16. Alfresco Storage Related Solutions
Geo-clusters and Redundancy
• Geo-clusters can be done by replicating DB and Content
store. Supported?
– Low level replication/sync
– Some customers have this.
– Some customers use NetApp NAS storage and
GoldenGate for DB replication
– Other replication tools: EMC CLARiiON,
EMC Symmetrix or IBM TotalStorage.
18. Third Party – Community Solutions
• StorNext
– It is not a connector; it is a solution for data life cycle management in the
background
– Alfresco sees it as a mount point and is not aware of it
– Runs over FC
• EMC Atmos
– XAM connector for Alfresco
• Alfresco Cloud Store
– Amazon S3
– https://code.google.com/p/alfresco-cloud-store/
• Amazon S3 for on premise
– https://issues.alfresco.com/jira/browse/AMZNSSS-26
• Walrus? The S3 alternative for Eucalyptus
19. Storage Best Practices
• Content Store
– Use Content Store Selector to manage content of
different sizes.
– The default content store should be faster than the others for
writing, to avoid bottlenecks (content arrives in the default store
and is then copied to the other content store)
– WORM disks as non-default content store (cleaner -
Jefferies)
– SAN if possible
– If NAS use a dedicated LAN if possible
– LVM if possible (scalability, snapshot)
– Clean trash bin often
– Delete “contentstore.deleted” often
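Before deleting anything from "contentstore.deleted", it helps to see what would go. The sketch below only lists files older than a given number of days and deletes nothing; the function name and the age threshold are assumptions, and the actual directory location depends on the installation.

```python
import os
import time


def files_older_than(root, days, now=None):
    """List files under root whose modification time is older than `days`."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds in a day
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                old.append(path)
    return sorted(old)
```

The returned list can be reviewed, or checked against the backup retention window, before any removal is scheduled.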
20. Storage Best Practices
• Indexes (SOLR or Lucene)
– Dedicated disk local or SAN.
– Avoid NAS.
– Have at least 50-75% of space free (backup and
merge)
– Consider using different file system for Lucene
backup and Solr backup.
• Logs
– Put your logs directory on a different file system from the
Content Store and Indexes.
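The free-space guideline for the index volume is easy to monitor. A minimal sketch using the standard library; the function names and the 50% default are assumptions taken from the lower end of the 50-75% range above, and the index path is whatever the installation uses.

```python
import shutil


def free_fraction(path):
    """Fraction of the file system holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def has_index_headroom(path, minimum_free=0.5):
    """True if the volume keeps at least `minimum_free` of its space free."""
    return free_fraction(path) >= minimum_free
```

A scheduled check such as `has_index_headroom("/path/to/indexes")` (hypothetical path) can warn before an index backup or merge runs out of room.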
21. Backup and Recovery
• Recovery Time Objective (RTO): the amount of time
that it takes to get your systems back online.
• Recovery Point Objective (RPO): the last
consistent data transaction prior to the disaster. If you
had a disaster, how much data would be lost?
• The Disaster Recovery plan (DR) focuses on getting
your business back up and running after a major outage
• The Business Continuance plan (BCP) focuses on
keeping your business running DURING the disaster.
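RTO and RPO become concrete once they are compared with timestamps from an incident or a recovery drill. A minimal sketch, assuming the data-loss window is measured as the time between the last consistent backup and the disaster; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_backup, disaster):
    """Time span of work lost: compared against the RPO."""
    return disaster - last_good_backup


def downtime(disaster, service_restored):
    """Time the system was offline: compared against the RTO."""
    return service_restored - disaster


def meets_objectives(last_good_backup, disaster, service_restored, rpo, rto):
    """True if both the recovery point and recovery time objectives were met."""
    return (data_loss_window(last_good_backup, disaster) <= rpo
            and downtime(disaster, service_restored) <= rto)
```

With a backup at 02:00, a disaster at 10:00 and service restored at 13:00, the data-loss window is 8 hours and the downtime is 3 hours, so a nightly backup alone cannot satisfy a 4-hour RPO.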
22. Backup and Recovery
• Alfresco Backup and Recovery Tool is
available:
– http://blyx.com/open-source-contributions/alfresco-bart/
• Alfresco Backup and Recovery White
Paper:
– http://www.slideshare.net/toniblyx/alfresco-backup-and-disaster-recovery-white-paper
23. Common Questions to SE?
• Best practices for storage?
– You got it.
• NAS or SAN?
– SAN if possible! NAS backed by a SAN is common as well. NAS is not bad,
but now you know why it is different.
• Required space for DB, Indexes, Content Store?
– It depends on the case, but DB and Indexes are usually about 20% of the Content Store
space (each).
• Do you have an archiving solution?
– Alfresco can be integrated with archiving solutions such as those mentioned above,
implemented with Content Store Selector.
• Do you have a backup/recovery solution?
– http://www.slideshare.net/toniblyx/alfresco-backup-and-disaster-recovery-white-paper
• Do you have a data encryption solution?
– Yes, Alfresco Encryption at Rest:
http://docs.alfresco.com/5.0/concepts/encrypted-overview.html
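The sizing rule of thumb from the answers above (DB and indexes each around 20% of the content store) works out as follows. This is a rough estimate only; the function name and the percentage parameters are assumptions for illustration, and real ratios depend on the case.

```python
def estimate_storage_gb(content_store_gb, db_percent=20, index_percent=20):
    """Rough capacity estimate from the content store size, in GB."""
    database = content_store_gb * db_percent / 100
    indexes = content_store_gb * index_percent / 100
    return {
        "content_store": content_store_gb,
        "database": database,
        "indexes": indexes,
        "total": content_store_gb + database + indexes,
    }
```

For a 1000 GB content store this gives about 200 GB each for the database and the indexes, 1400 GB in total, before adding the free-space headroom recommended for the index volume.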
24. What kind of storage can I use
with Alfresco?
• Any mountable volumes that can be made to
appear as standard local filesystems (local disks,
NAS, SAN, etc.)
• Amazon S3 (for Alfresco installations in AWS)
• Centera (through the now open source
connector)
• EMC Atmos (through a partner-created
integration)
• CAStor (through a dated partner-created
integration)
26. Deleting Content
• A complex process
• You need to know this because it impacts
– Disk space management
– Backup and recovery procedures (and their integrity)
– Security and auditing
• You have a wide degree of control over what happens
and when
• You need to do some work
• More info: page 24 of
http://www.slideshare.net/toniblyx/alfresco-security-best-practices-guide