Changeset - 6ed439c835bd
Lance Edgar - 5 years ago 2019-08-14 23:24:49
ledgar@techsupport.coop
Add some inter-doc links
2 files changed with 12 insertions and 5 deletions:
0 comments (0 inline, 0 general)
docs/api/rattail/importing/index.rst
 

	
 
``rattail.importing``
 
=====================
 

	
 
.. automodule:: rattail.importing
 

	
 
There are a handful of importers and import handlers made available within this
 
namespace, i.e. not just the base classes but some variations thereof etc.  You
 
may import each of these directly from this namespace, e.g.::
 

	
 
   from rattail.importing import Importer
 

	
 
However it's more typical to do this instead::
 

	
 
   from rattail import importing
 

	
 
That way you can reference ``importing.Importer`` as well as
 
``importing.model.ProductImporter`` etc.  The full list of what's available in
 
this ``rattail.importing`` namespace follows.
 

	
 
Please also see :doc:`/narr/importers` for some of the general concepts
 
involved here.
 

	
 
Importers
 
---------
 

	
 
 * :class:`rattail.importing.importers.Importer`
 
 * :class:`rattail.importing.importers.FromQuery`
 
 * :class:`rattail.importing.importers.BulkImporter`
 
 * :class:`rattail.importing.sqlalchemy.FromSQLAlchemy`
 
 * :class:`rattail.importing.sqlalchemy.ToSQLAlchemy`
 
 * :class:`rattail.importing.postgresql.BulkToPostgreSQL`
 

	
 
Import Handlers
 
---------------
 

	
 
 * :class:`rattail.importing.handlers.ImportHandler`
 
 * :class:`rattail.importing.handlers.BulkImportHandler`
 
 * :class:`rattail.importing.handlers.FromSQLAlchemyHandler`
 
 * :class:`rattail.importing.handlers.ToSQLAlchemyHandler`
 
 * :class:`rattail.importing.rattail.FromRattailHandler`
 
 * :class:`rattail.importing.rattail.ToRattailHandler`
 

	
 
Rattail Model Importers
 
-----------------------
 

	
 
This is a little different but worth a mention.  If you do::
 

	
 
   from rattail import importing
 

	
 
then you will have access to the full set of Rattail data importers (i.e. the
 
"local" side of an import which targets a Rattail database) via
 
``importing.model`` - in other words you can then do this:
 

	
 
.. code-block:: python
 

	
 
   class ProductImporter(importing.model.ProductImporter):
 
       """ Custom product importer. """
 

	
 
Of course that's only helpful if you're importing data to Rattail.  See
 
:mod:`rattail.importing.model` for what's available in that namespace.
docs/narr/importers.rst
 

	
 
Data Importers
 
==============
 

	
 
A frequent need when integrating systems is to import (or export, depending on
 
your perspective) data from one system to another.  Rattail provides a
 
framework for this, which offers the following benefits:
 

	
 
 * "dry run" mode to check things out before committing changes
 
 * "warnings" mode which sends email with data diffs, e.g. when you expect no changes
 
 * adjustable "batch size" for grouping changes when submitting to local system
 
 * full command line support for above, plus "max" changes to apply, show progress, etc.
 
 * core code is optimized to run quickly, e.g. by fetching all data up-front
 
 * new importers may be created simply / cleanly / according to existing patterns
 
 * new importers may extend / replace core functionality as needed
 

	
 
The rest of this document aims to explain the concepts and patterns involved
 
with the Rattail importer framework.
 

	
 
See also the API docs for :mod:`rattail.importing` (and sub-modules).
 

	
 

	
 
"Importer" vs. "DataSync"
 
-------------------------
 

	
 
Perhaps the first thing to clear up is that, while Rattail also has a
 
"datasync" framework, tasked with keeping systems in sync in real-time, the
 
"importer" framework is tasked with a "full" sync of two systems.  In other
 
words "datasync" normally deals with only one (e.g. changed) "host" object at a
 
time, and will update the "local" system accordingly, whereas an "importer"
 
will examine *all* host objects and update local system accordingly.  Also,
 
datasync normally runs as a proper daemon, whereas an importer will normally
 
run either as a cron job or in response to user request via command line or web
 
UI, etc.
 

	
 
.. todo::
 
   Write datasync docs / link here.
 

	
 
To make things even more confusing, datasync can leverage an import handler /
 
importer(s) where possible so that the same logic is executed for both
 
"real-time sync" and "full sync" modes.
 

	
 

	
 
"Host" vs. "Local" Systems
 
--------------------------
 

	
 
From the framework's perspective, all import tasks have two "systems" involved:
 
one is dubbed "host" and refers to the *source* of external/new data; the other
 
is dubbed "local" and refers to the *target* where existing data is to be
 
changed.  It is important to understand what "host" and "local" refer to as you
 
will encounter those terms frequently in the documentation (and code).
 

	
 
Note that it is perfectly fine for the same "system" proper to be used as both
 
host and local systems within a given importer.  Meaning, you can read some
 
data from one system, and then write data changes back to the same system.
 
This can be useful for applying business rules logic to "core" (e.g. customer)
 
records as an asynchronous process after they are changed normally within UI or
 
as part of EOD etc.  Typical use though of course is for the host and local
 
systems to be actually different systems.
 

	
 
The term "system" here doesn't imply a database or anything in particular,
 
really.  All that is required of a "host system" is that it be able to provide
 
data for the import; all required of a "local system" is that it be able to
 
provide "corresponding data" (i.e. for comparison, to determine if an
 
add/update/delete is needed) and/or be able to apply add/update/delete
 
operations as requested.  Therefore in practice either the "host" or "local"
 
systems may be a database, web API, Excel spreadsheet, flat text file, etc.
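
The comparison step described above can be sketched in plain Python.  This is only an illustration under stated assumptions, not the framework's actual code: the function name ``diff_records``, the dict-based "systems" and the sample UPC keys are all hypothetical.

```python
def diff_records(host, local, fields):
    """
    Compare host and local records (dicts keyed by record key) and
    report which keys need a create, update or delete on the local side.
    Hypothetical helper, not part of Rattail.
    """
    creates, updates, deletes = [], [], []
    for key, host_rec in host.items():
        local_rec = local.get(key)
        if local_rec is None:
            # host has it, local does not
            creates.append(key)
        elif any(host_rec[f] != local_rec[f] for f in fields):
            # both have it, but at least one field differs
            updates.append(key)
    for key in local:
        if key not in host:
            # local has it, host no longer does
            deletes.append(key)
    return creates, updates, deletes


# made-up sample data; either side could just as well come from a
# database, web API, spreadsheet or flat file
host = {
    "074305001321": {"description": "Apple Cider Vinegar", "price": 4.99},
    "074305001322": {"description": "Rice Vinegar", "price": 3.49},
}
local = {
    "074305001321": {"description": "Apple Cider Vinegar", "price": 4.49},
    "074305009999": {"description": "Discontinued Item", "price": 1.00},
}

creates, updates, deletes = diff_records(host, local, ["description", "price"])
```

Note that nothing here depends on how either side stores its data; each only has to hand over records in a comparable form.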
 

	
 
Also, the host -> local data flow is not always strictly the case; for instance
 
it sometimes is necessary to change the "host" system to reflect changes which
 
were made in the "local" system (e.g. mark a host record as exported).  The
 
typical scenario of course is for only the "local" system to be changed.
 

	
 
Since all importers have this "host -> local" pattern, on the code level it is
 
almost always the case that an importer will inherit from two base classes, one
 
for the host side and another for the local.  More on that later though.
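
As a rough preview of that "two base classes" idea, here is a toy sketch using plain Python classes.  None of these names (``FromListHost``, ``ToDictLocal``, ``read_host`` and so on) come from Rattail; they are assumptions made only to show how one importer class can combine a host-side base with a local-side base.

```python
class FromListHost:
    """Hypothetical host-side base: knows how to read source records."""

    host_records = []

    def read_host(self):
        return list(self.host_records)


class ToDictLocal:
    """Hypothetical local-side base: knows how to read/write the target."""

    def __init__(self):
        self.local_store = {}

    def read_local(self, key):
        return self.local_store.get(key)

    def write_local(self, key, data):
        self.local_store[key] = data


class ToyProductImporter(FromListHost, ToDictLocal):
    """Combines one host-side base with one local-side base."""

    host_records = [
        {"upc": "0001", "description": "Widget"},
        {"upc": "0002", "description": "Gadget"},
    ]

    def run(self):
        changed = 0
        for record in self.read_host():
            key = record["upc"]
            # only touch the local side when something actually differs
            if self.read_local(key) != record:
                self.write_local(key, record)
                changed += 1
        return changed


importer = ToyProductImporter()
first_run = importer.run()    # local side starts empty
second_run = importer.run()   # local side already matches host
```

Swapping either base class for another one changes where data comes from or goes to, without touching the importer's own logic.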
 

	
 

	
 
"Importer" vs. "Import Handler"
 
-------------------------------
 

	
 
Another important distinction within the framework itself is that of the
 
"importer" vs. "import handler".  Technically a single ``Importer`` contains
 
the logic for reading data from host, and reading/changing data on local
 
system, but specific to a single "data model" (e.g. products table) whereas an
 
``ImportHandler`` contains logic for the overall transaction
 
(i.e. commit/rollback).  Therefore a single *import handler* might "handle"
 
multiple *importers*, e.g. one each for products, customers, etc., so that multiple
 
data models might be updated within a single transaction.
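
The division of labor can be sketched with two toy classes.  This is a simplified illustration, not the real ``Importer`` / ``ImportHandler`` API: the class names, the ``import_data`` signatures and the use of a deep-copied dict as a stand-in for a database transaction are all assumptions.

```python
import copy


class ToyImporter:
    """Handles a single data model (one 'table')."""

    def __init__(self, name, host_data):
        self.name = name
        self.host_data = host_data

    def import_data(self, target):
        table = target.setdefault(self.name, {})
        table.update(self.host_data)
        return len(self.host_data)


class ToyHandler:
    """Runs several importers inside one all-or-nothing 'transaction'."""

    def __init__(self, importers, database):
        self.importers = importers
        self.database = database

    def import_data(self, dry_run=False):
        # work on a copy; if any importer raises, the copy is simply
        # discarded, which plays the role of a rollback
        working = copy.deepcopy(self.database)
        counts = {}
        for importer in self.importers:
            counts[importer.name] = importer.import_data(working)
        if not dry_run:
            # "commit": replace the real data with the working copy
            self.database.clear()
            self.database.update(working)
        return counts


database = {}
handler = ToyHandler(
    [
        ToyImporter("Department", {1: "Grocery"}),
        ToyImporter("Product", {"0001": "Widget", "0002": "Gadget"}),
    ],
    database,
)

dry_counts = handler.import_data(dry_run=True)
after_dry_run = copy.deepcopy(database)   # still empty
real_counts = handler.import_data()
```

A dry run reports the same counts as a real run but leaves the target untouched, which is why it makes sense for the commit/rollback decision to live in the handler rather than in each importer.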
 

	
 
Note however that even within these docs you will find the term "importer"
 
thrown around more often, sometimes in the generic sense meant only to refer to
 
the overall importer concept / framework / implementation.  Where the
 
distinction matters, these docs try to make it explicit.
 

	
 
Also note that in practice, the "handler" abstraction layer is not always
 
strictly necessary; for instance you might need an importer to push new
 
customer email addresses to an online mailing list, and it may have to use a
 
web API which only supports one add per call.  In other words you have only one
 
"data model" to update, so you don't need a handler to manage multiple
 
importers, and the web API doesn't support the commit/rollback approach because
 
each change submitted is committed at once.  However the suggested approach is
 
to stick with established patterns and use a handler; various other parts of
 
the Rattail framework (command line, datasync) will expect one.
 

	
 

	
 
Making a new Importer
 
---------------------
 

	
 
Okay then, you must be serious if you made it this far...
 

	
 
First step of course will be to identify the "host" and "local" systems for
 
your particular scenario.  For the sake of a simple example here we'll assume
 
you wish to import product data from your "host" point of sale system (named
 
"MyPOS" within these docs) to your "local" Rattail system.
 

	
 
Note also that to make a new importer, you must have already started a project
 
based on Rattail; this doc will not explain that process.  The examples which
 
follow assume this project is named 'myapp'.
 

	
 
.. note::
 
   For now, we do have a wiki doc for `Creating a New Project`_.  Note that the
 
   wiki uses the name "Poser" to refer to the custom app, whereas the doc
 
   you're currently reading uses "myapp" for the same purpose.  Some day they
 
   both should use "Poser" though...
 

	
 
.. _Creating a New Project: https://rattailproject.org/moin/NewProject
 

	
 

	
 
File / Module Structure
 
^^^^^^^^^^^^^^^^^^^^^^^
 

	
 
With the host and local systems identified, you can now start writing
 
code...but where to put it?  Assuming you already have a Rattail-based project
 
with package named 'myapp' and assuming you were adding a POS->Rattail
 
importer, the suggestion would be to add the following files to your project:
 

	
 
.. code-block:: none
 

	
 
   myapp/
 
      __init__.py
 
      importing/
 
         __init__.py
 
         model.py
 
         mypos.py
 

	
 
This is just a suggestion really, although it is the author's personal
 
convention which has served him well.  Another typical scenario might be where
 
you wish to "export" data from Rattail->POS, in which case you might do
 
something like this instead:
 

	
 
.. code-block:: none
 

	
 
   myapp/
 
      __init__.py
 
      mypos/
 
         __init__.py
 
         importing/
 
            __init__.py
 
            model.py
 
            rattail.py
 

	
 
The difference may be subtle, but the intended effect is for the ``model.py``
 
file to contain logic which targets the "local" side of the importer, while the
 
"other" file (e.g. ``mypos.py`` in the first example, ``rattail.py`` in the
 
second) would contain logic for the "host" side of the importer.  This "other"
 
file is also where the import *handler* would live, since ultimately both sides
 
must be known for an importer to function.
 

	
 
The main advantage to this layout / structure is that a given ``model.py``
 
might be shared among various importers.  For example
 
``rattail.importing.model`` defines all the natively-supported importer logic
 
when targeting various Rattail data models on the local side.  (So technically
 
if you didn't need to override any of that, you wouldn't need to provide your
 
own ``model.py`` in the POS->Rattail scenario.)
 

	
 
Note that in practice the ``__init__.py`` file for an ``importing`` package
 
typically has (only) the following contents, for convenience:
 

	
 
.. code-block:: python
 

	
 
   from . import model
 

	
 

	
 

	
 
Define Import Handler
 
^^^^^^^^^^^^^^^^^^^^^
 

	
 
For the sake of a single example we'll continue to assume a POS->Rattail import
 
is desired.  Given the above file structure, that means the file
 
``myapp/importing/mypos.py`` will contain the handler.  Within that file you'll
 
need to add something like the following:
 

	
 
.. code-block:: python
 

	
 
   from rattail import importing
 
   from rattail.gpc import GPC
 

	
 
   from myapp.mypos.db import Session as MyPosSession, model as mypos
 

	
 

	
 
   class FromPosToRattail(importing.FromSQLAlchemyHandler, importing.ToRattailHandler):
 
       """
 
       Handler for MyPOS -> Rattail import.
 
       """
 
       host_title = "MyPOS"
 
       local_title = "Rattail"
 

	
 
       def make_host_session(self):
 
           return MyPosSession()
 

	
 
       def get_importers(self):
 
           return {
 
               'Department':    DepartmentImporter,
 
               'Vendor':        VendorImporter,
 
               'Product':       ProductImporter,
 
           }
 

	
 
Note that the importers (dept/vend/prod) don't exist yet; those will be defined
 
next, within this same file.  Also here you can again see the strong "host ->
 
local" patterns within the handler.
 

	
 
Choosing the correct base class(es) will be important.  Here, by inheriting