Drupal content-migration
1. Migrating content to Drupal – Migrate Module
Ashok Modi (BTMash) – Drupal LA – March 2011
2. Agenda
- Different methods
- Steps to work with Migrate
  - Hooks
  - Build a class
  - Description
  - Source
  - Adding mapping
  - Additional data
- drush commands
- Q & A
3. Disclaimer
- Code heavy! You may fall asleep.
- New territory!
- Talk about one *small* aspect of migration (migrating nodes)
- Not talking about creating own DestinationHandlers
- Possibly not talking about creating own FieldHandlers (depends on time)
- Can walk through an example migration that I did if preferred.
- Ask questions? It will make the presentation less terrible
4. Possible method – by hand
- Should be accurate
- Get all files
- Mapping / everything works
- Time consuming
- Not feasible if you have a lot of content.
- Good way to test interns / punish coworkers (?)
5. Possible methods – Node Export
- Node Export (http://drupal.org/project/node_export)
- Has 7.x branch
- But no way to update content from 6.x -> 7.x
- No way to go back
- *easy* to set up (requires exact setup between source and destination in field names, etc)
6. Possible methods – Feeds
- Really good method
- Map fields from source to destination
- Can import RSS / Atom / various types of feeds
- Also flat files such as CSV
- Well documented
- Other than flat files, requires a feed source
- Might run into issues if content is updated in source
- *might be tricky in another cms*
7. Method demonstrated – Migrate
- Already defines many possible import sources
  - XML, JSON, CSV, databases (any currently supported by Drupal!)
- Can import many different types of content
  - Users, nodes, comments, taxonomy, files … all core entities
  - Can define your own import handler (not covered in presentation)
- Can define own method for importing custom fields
  - Currently already defined for all core field types
  - Has support for Media module importing
  - Patch underway for getting date import
  - Can define your own field handler (possibly not covered in presentation)
- Drush integration
  - Rollback, status updates, limited import.
- Caveat – confusing documentation
- Only a status UI – all mapping must be done in code.
8. Assumptions made for presentation
- Migrating from a database
- Files directory for the source is on the same machine as the destination site directory
9. Steps to work with Migrate
- Let Migrate know about your module (1 hook!)
- Build a Migration class
  - Give it a description
  - Let Migrate know what you’re getting content from.
  - Let Migrate know about the type of content.
  - Map the fields the migrate class is going to fill.
  - (Optional) Massage / add any fields you couldn’t get in the initial mapping (query).
10. Step 1: Hook
- Implement one hook – hook_migrate_api
- Provide the api version number (currently at version 2)
- That’s it!

    function mymodule_migrate_api() {
      return array(
        'api' => 2,
      );
    }
11. Step 2: Build a Class
- Implement classes
- Class defines type of content that will be imported

    class NodeContentTypeMigration extends Migration {
      public function __construct() {
        parent::__construct();
        …
      }

      public function prepareRow($current_row) {
        …
      }
    }
12. Step 2: Build a Class (functions inside)
- public function __construct() {…}
  - Constructor for the class
  - Allows migrate to know content type (user, node, tags)
  - Where content is mapped from (db, csv, xml, etc)
  - All the mappings coming in (fields)
- (optional) public function prepareRow($current_row) {…}
  - Any extra data (things that cannot be pulled in a single query(?))
  - Massage any of the data that was pulled in (clean up text, clean up links, etc)
13. Step 2a: Create a description
- Create a description
  - Class description
  - Any additional source fields (not found in initial query)
  - Initial source -> destination mapping (what is the key in the source db?)

    $this->description = t("Import all nodes of type PAGE");

- Define source fields
  - Fields that may not be getting pulled in via your query or in the flat file data but will be getting migrated somehow

    $source_fields = array(
      'nid' => t('The node ID of the page'),
      'my_files' => t('The set of files in a field for this node'),
    );
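14. Off course: query source database
- Set up query (if need be, switch DBs using Database::getConnection)

    // Arguments are ($target, $key). 'for_migration' is assumed to be the
    // connection key defined in settings.php.
    $connection = Database::getConnection('default', 'for_migration');

- Then write out rest of the query, starting from that connection

    $query = $connection->select('table_name', 't');

- Alternatively, if source db is on same machine as destination db, use the mysql db shortcut (database name prefixed to the table name)

    db_select(MY_MIGRATION_DATABASE_NAME . '.table_name', 't')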
15. Step 2b: Call to grab data
- NOTE: This is only for migrations from databases
- Set up query (if need be, switch DBs using Database::getConnection)

    $query = db_select(MY_MIGRATION_DATABASE_NAME . '.node', 'n')
      ->fields('n', array('nid', 'vid', 'type', 'language', 'title', 'uid', 'status', 'created', 'changed', 'comment', 'promote', 'moderate', 'sticky', 'tnid', 'translate'))
      ->condition('n.type', 'page', '=');
    $query->join(MY_MIGRATION_DATABASE_NAME . '.node_revisions', 'nr', 'n.vid = nr.vid');
    $query->addField('nr', 'body');
    $query->addField('nr', 'teaser');
    $query->join(MY_MIGRATION_DATABASE_NAME . '.users', 'u', 'n.uid = u.uid');
    $query->addField('u', 'name');
    $query->orderBy('n.changed');
16. Step 2b: Why the orderby?
- Migrate module has a feature called ‘highwater’
- It is a key to designate and figure out if a piece of content needs to be updated rather than inserted.
- Means content can be updated!

    $this->highwaterField = array(
      'name' => 'changed',
      'alias' => 'n',
    );
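
The highwater idea can be tried outside Drupal. What follows is a minimal plain-PHP sketch, not the Migrate module's implementation: the function name rows_needing_import and the sample rows are hypothetical. It keeps only rows whose 'changed' value is newer than the mark stored from the previous run, and reports the new mark. In a real run the mark is presumably advanced as rows are processed, which would explain why the query is ordered by 'changed'. This sketch simply takes the maximum.

```php
<?php
/**
 * Plain-PHP illustration of a highwater mark (hypothetical helper, not Migrate API).
 *
 * @param array $rows      Source rows, each with 'nid' and 'changed' keys.
 * @param int   $highwater Largest 'changed' value seen on the previous run.
 *
 * @return array Two items: the rows to (re)import, and the new highwater mark.
 */
function rows_needing_import(array $rows, $highwater) {
  $pending = array();
  $new_mark = $highwater;
  foreach ($rows as $row) {
    // Compare against the mark from the previous run, so that several rows
    // sharing the same 'changed' timestamp are all picked up.
    if ($row['changed'] > $highwater) {
      $pending[] = $row;
      $new_mark = max($new_mark, $row['changed']);
    }
  }
  return array($pending, $new_mark);
}

// Sample data (made up): node 1 is older than the stored mark, 2 and 3 are newer.
$rows = array(
  array('nid' => 1, 'changed' => 100),
  array('nid' => 2, 'changed' => 200),
  array('nid' => 3, 'changed' => 300),
);
list($pending, $mark) = rows_needing_import($rows, 150);
```

Here $pending holds nodes 2 and 3, and $mark becomes the value to store for the next run.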
17. Step 2c: Mappings
- Add a ‘mapping’ (this is for tracking relationships between the rows from the source db and the rows that will come in the destination site) – essentially key of source DB.

    $this->map = new MigrateSQLMap(
      $this->machineName,
      array(
        'nid' => array(
          'type' => 'int',
          'unsigned' => TRUE,
          'not null' => TRUE,
          'description' => 'D6 Unique Node ID',
          'alias' => 'n',
        ),
      ),
      MigrateDestinationNode::getKeySchema()
    );
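
To make the purpose of the map concrete, here is a rough plain-PHP sketch of the bookkeeping it stands for. This is an in-memory stand-in under my own assumptions, not how MigrateSQLMap is implemented: the functions import_row and rollback and the arrays $map and $destination are hypothetical. The point is that remembering source id -> destination id lets a second import update a row instead of duplicating it, and lets a rollback remove only what was migrated.

```php
<?php
/**
 * Import one source row, using the map to decide between insert and update.
 * Hypothetical helper for illustration only.
 */
function import_row(array &$map, array &$destination, array $source_row) {
  $source_id = $source_row['nid'];
  if (isset($map[$source_id])) {
    // Seen before: update the destination row recorded in the map.
    $destination[$map[$source_id]]['title'] = $source_row['title'];
    return 'updated';
  }
  // Not seen before: create a destination row and remember the pairing.
  $new_id = $destination ? max(array_keys($destination)) + 1 : 1;
  $destination[$new_id] = array('title' => $source_row['title']);
  $map[$source_id] = $new_id;
  return 'inserted';
}

/**
 * Remove only the destination rows that the map says were migrated.
 */
function rollback(array &$map, array &$destination) {
  foreach ($map as $dest_id) {
    unset($destination[$dest_id]);
  }
  $map = array();
}

// The destination already holds one hand-made page (id 5) before migrating.
$map = array();
$destination = array(5 => array('title' => 'Existing page'));

$first = import_row($map, $destination, array('nid' => 42, 'title' => 'About'));
$second = import_row($map, $destination, array('nid' => 42, 'title' => 'About us'));
$title_after_update = $destination[$map[42]]['title'];

rollback($map, $destination);
```

After the rollback, only the hand-made page with id 5 is left in $destination.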
18. Step 2c: Mappings (cont’d)
- Now let the migrate module know what kind of mapping is being performed.

    $this->source = new MigrateSourceSQL($query, $source_fields);

- Along with the type of content

    $this->destination = new MigrateDestinationNode('page');
19. Step 2d: Map Fields
- Usually follows the form

    $this->addFieldMapping('destination_field_name', 'source_field_name');
    $this->addFieldMapping('revision_uid', 'uid');

- Can provide default values

    $this->addFieldMapping('pathauto_perform_alias')->defaultValue('1');

- Can provide no value

    $this->addFieldMapping('path')->issueGroup(t('DNM'));

- Can provide arguments and separators for certain field types (body, file, etc require this methodology)
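
What the three kinds of mapping amount to can be shown with a small plain-PHP sketch. It only imitates the effect and is not the Migrate API: the $mappings array layout and the apply_mappings function are hypothetical, and the 'title' mapping is added just to have a second copied field. A mapped field is copied from the source row, a default fills in when there is no source, and an unmapped destination (like 'path' above) is simply left out.

```php
<?php
// Hypothetical mapping table: destination field => source field and/or default.
$mappings = array(
  'title' => array('source' => 'title'),
  'revision_uid' => array('source' => 'uid'),
  'pathauto_perform_alias' => array('default' => '1'),
  // No source and no default: nothing is migrated into this field.
  'path' => array(),
);

/**
 * Build a destination row from a source row (illustration only).
 */
function apply_mappings(array $mappings, array $source_row) {
  $destination = array();
  foreach ($mappings as $dest_field => $mapping) {
    if (isset($mapping['source']) && isset($source_row[$mapping['source']])) {
      // Copy the value over from the mapped source field.
      $destination[$dest_field] = $source_row[$mapping['source']];
    }
    elseif (array_key_exists('default', $mapping)) {
      // No usable source value: fall back to the default.
      $destination[$dest_field] = $mapping['default'];
    }
  }
  return $destination;
}

// Made-up source row for the example.
$node = apply_mappings($mappings, array('title' => 'About', 'uid' => 7));
```

The resulting $node has 'title', 'revision_uid' and 'pathauto_perform_alias' set, and no 'path' key at all.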
20. Step 3: Additional data / cleanup
- Optional
- public function prepareRow($current_row)
- Use it to add any additional data / cleanup any fields that were mapped in.

    // Get the correct uid based on username & set author id for node to uid.
    // Assumes the u.name column added in the source query is available as
    // $current_row->name.
    $username = $current_row->name;
    $user_query = db_select('users', 'u')
      ->fields('u', array('uid'))
      ->condition('u.name', $username, '=');
    $results = $user_query->execute();
    foreach ($results as $row) {
      $current_row->uid = $current_row->revision_uid = $row->uid;
      break;
    }
21. Drush Commands (the important ones)
- drush ms – List various migration import classes
- drush mi <importclass> – Import content
- drush mr <importclass> – Rollback content
- Options
  - --idlist=id1,id2,… – Import content with specific IDs
  - --itemlimit=n – Only import up to ‘n’ items
  - --feedback="n seconds" – Show status report every ‘n’ seconds
  - --feedback="n items" – Show status report every ‘n’ items
22. Resources
- http://drupal.org/project/migrate
- http://drupal.org/node/415260
- Look at the example modules
- http://drupal.org/project/migrate_extras
- http://drupal.org/project/wordpress_migrate
- http://cyrve.com/import (drush documentation)
- http://goo.gl/3e1Jm (additional documentation to be added to core project)
- http://goo.gl/2qDLh (another example module)