Tuesday, December 13, 2011

Input Pipelines and Sensors Flow Soup

Fig 1. Input Layer Abstraction

By convention, input devices are the peripherals that detect input from a human user, while sensors detect variations in the environment, whether or not these are related to the human.
Other definitions exist, but both input devices and sensors are needed to provide rich interaction with users and to let them accomplish their tasks in less time.

Input devices include the keyboard and the mouse, which almost every application supports, but the definition also covers touchscreens and multitouch screens, touchpads, joysticks, etc.
From a wider point of view, the keyboard and the mouse seem to be two constants: K1 and K2. Operating systems model them with finite state machines and route their events to the application without much alteration of the original content (apart from, for example, the translation of raw codes into ASCII or Unicode).

The "Why"
Operating systems, and I'll take Linux as an example, have abstraction layers to handle input. Every peripheral that generates key presses is seen as a keyboard. Touchpads, mice, touchscreens, pens, etc. are attached to a virtual pointer device and handled as a conventional mouse.
Device drivers may generate other information (finger blob size, blob orientation, etc.), but all of it is discarded and never reaches the application.

In recent years, we have seen the emergence of a seemingly unlimited number of input devices and sensors, each providing very different information. On the other side are applications that expose a lot of features, or that have more variables to control than a single input device offers (controlling a 3D object in a 3D environment, for example).

With these two poles, the problem becomes obvious:
Why is the operating system preventing us from using all of a device's capabilities inside our rich applications?

OK, let's fix this and answer the question with another:
Do you have another solution for mapping input events to application features?

Mapping Input events
Before speaking about mapping, we should study the input itself, taking a few examples.
Touchpads have their input treated as a compatible form of mouse input, but their raw data is very noisy. With direct input, the pointer jitters like the hand of a patient with Parkinson's disease. The developers have added a "filter" that smooths the movement (named "Response-augmented Exponentially Weighted Moving Average Filter").
As for mouse input, they also add another "filter" that alters the smoothed input to accelerate its movement, so that the user doesn't need to swipe many times to move the pointer across bigger screens.
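To make the two filters concrete, here is a minimal sketch in Python. It assumes a plain exponentially weighted moving average, not the response-augmented variant named above, and a simple threshold-based acceleration. The names EwmaFilter and accelerate and the values of alpha, threshold and gain are hypothetical and are not taken from any real driver.

```python
import math


class EwmaFilter:
    """Plain exponentially weighted moving average over pointer deltas.

    Simplified stand-in for the smoothing filter described in the text;
    it is NOT the response-augmented variant.
    """

    def __init__(self, alpha=0.5):
        self.alpha = alpha   # hypothetical smoothing factor, 0 < alpha <= 1
        self.state = None    # last smoothed (dx, dy); None until the first sample

    def __call__(self, dx, dy):
        if self.state is None:
            # The first sample passes through unchanged.
            self.state = (float(dx), float(dy))
        else:
            sx, sy = self.state
            a = self.alpha
            self.state = (a * dx + (1 - a) * sx, a * dy + (1 - a) * sy)
        return self.state


def accelerate(dx, dy, threshold=4.0, gain=2.0):
    """Multiply fast movements by a gain (hypothetical threshold and gain)."""
    if math.hypot(dx, dy) > threshold:
        return dx * gain, dy * gain
    return dx, dy
```

Chaining them as accelerate(*smooth(dx, dy)), with smooth = EwmaFilter(), reproduces the order described above: smoothing first, then acceleration.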
Now let's imagine that this flow, represented in Fig 2, ends up as the input that controls the camera in a 3D scene:

Fig 2. Input Flow to control a 3D scene

From this representation, the input system can be seen as a flow: at each point, the input is transformed before being routed to the application. The 3D scene itself takes that input and maps it internally to control the camera view.
The problem in this case is: what if you want your input handling to follow the standard rather than be considered a hack? You can't avoid the virtual pointer, and you can't avoid this predefined set of filters. Nor can you choose how the input is mapped inside the application (mouse(x,y) → scene(x,y), or → scene(y,z), etc.).
Another problem is that some filters live inside the input device driver, others inside the input layer, and the rest belong to the application. All of them are hard-coded and can't be changed or remapped without recompiling the full stack.
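As an illustration of what a selectable mapping could look like, the sketch below keeps the mouse(x,y) → scene(x,y) and mouse(x,y) → scene(y,z) choices in a table that can be changed at run time instead of being compiled in. MAPPINGS and map_pointer are hypothetical names invented for this example, not part of any existing input stack.

```python
# Hypothetical mapping table: each entry turns a 2D pointer delta
# into a 3D scene delta (x, y, z).
MAPPINGS = {
    "xy": lambda dx, dy: (dx, dy, 0.0),   # mouse(x, y) -> scene(x, y)
    "yz": lambda dx, dy: (0.0, dx, dy),   # mouse(x, y) -> scene(y, z)
}


def map_pointer(dx, dy, mode="xy"):
    """Apply the mapping selected by name; unknown names raise KeyError."""
    return MAPPINGS[mode](dx, dy)
```

Because the table is ordinary data, a new mapping can be registered or swapped while the application runs, which is exactly what the hard-coded stack does not allow.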

Any time we have a new device or new events to support in the system, we keep asking this question:
How will we fit the new device's input into this flow with minimum effort and the least loss of information?

Rethinking All the Stack
In 2009, I discovered all of this and found some efforts to simplify input management. Just being able to imagine a possible change takes a lot of bravery... For a system developer who only uses a keyboard, why should he think about rewriting everything? That's a lot of work, man!

From the previous figures, we can imagine that filters should be extracted from device drivers and from any predefined flow in the system, so that we can play with them to modify and transform raw input into any useful form.
The flow becomes a general graph, in which a sensor input can control how strongly a filter transforms one device stream into another. For real-time and other needs, we may also add a universal clock to keep track of how much time each filter takes for its computation.
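The idea of a sensor steering a filter, with a clock measuring each filter, can be sketched as follows. This is only an illustration under stated assumptions: it uses a linear chain rather than a general graph, and FilterNode, scale, run_chain and damp_by_sensor are hypothetical names that do not come from StreamInput or any other existing API.

```python
import time


class FilterNode:
    """A filter with adjustable parameters and a record of its last run time."""

    def __init__(self, name, fn, **params):
        self.name = name
        self.fn = fn
        self.params = dict(params)   # can be changed while the flow is running
        self.last_duration = 0.0     # seconds spent in the most recent call

    def process(self, value):
        start = time.perf_counter()
        result = self.fn(value, **self.params)
        self.last_duration = time.perf_counter() - start
        return result


def scale(value, factor=1.0):
    """Trivial example filter: multiply the stream value by a factor."""
    return value * factor


def run_chain(nodes, value):
    """Push one value through the nodes in order (a chain, not a full graph)."""
    for node in nodes:
        value = node.process(value)
    return value


def damp_by_sensor(node, reading):
    """Control edge: a larger sensor reading lowers the node's factor."""
    node.params["factor"] = 1.0 / (1.0 + reading)
```

Here the sensor never touches the device stream directly; it only changes a parameter of the filter, which is the kind of control edge a general graph would allow between any sensor and any filter.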

Fig 3. Input Flow as represented by StreamInput Khronos Model.

What about applications ?
Legacy applications take only two sorts of input: keyboard and mouse. These two inputs remain very limited compared to what is possible. But we already have filters that route information through them, including keyboard mapping transformers (which make it possible to write in Arabic, French or Chinese with the same keyboard), multitouch injectors for legacy apps like Ginn (included in Ubuntu), and so on.

But wait, shouldn't we rethink the application itself? Why don't applications expose their functionality through a software bus, where we have the freedom to connect the filtered input to the specific action to be performed?
Fig 4. Rethinking application input

By doing this, the biggest piece of work becomes the mapping, and we will need to search for the best ways to do it so that users can achieve their goals more effectively.
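A minimal sketch of such a software bus, assuming an in-process registry rather than a real IPC bus, could look like this. ActionBus, the action name "camera.rotate" and the source name "touchpad.smoothed" are all hypothetical, chosen only to illustrate the idea.

```python
class ActionBus:
    """Toy in-process bus: applications expose actions, users wire sources to them."""

    def __init__(self):
        self._actions = {}   # action name -> callable exposed by the application
        self._routes = {}    # source name -> list of connected action names

    def expose(self, name, handler):
        """Called by the application to publish one of its features."""
        self._actions[name] = handler

    def connect(self, source, action):
        """Called by whoever does the mapping, at run time."""
        if action not in self._actions:
            raise KeyError(f"unknown action: {action}")
        self._routes.setdefault(source, []).append(action)

    def emit(self, source, *args):
        """Deliver a filtered input event to every action connected to its source."""
        return [self._actions[name](*args) for name in self._routes.get(source, [])]
```

For example, an application could call bus.expose("camera.rotate", rotate_camera), and a configuration tool could later call bus.connect("touchpad.smoothed", "camera.rotate"). The application no longer knows or cares which device drives the action.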

Scientific background ?
A lot of scientific research has studied these problems, but each effort tried to fix either the input flow or the mapping, not the whole cycle from and to the user.

For input configuration, see the work of Pierre Dragicevic: iCon.
Another interesting thesis is the one by Rami Ajaj (PhD thesis, in French).
There is also the paper "Theoretical and architectural support for input device adaptation".

Many other notable works exist, but I can't cite them all.

And for a deep study of mapping, and to understand the need for filters, it is essential to study what an input device and a sensor are, do some morphological analysis of their design space, and study the actual standards implemented in operating systems, which describe a large spectrum of devices and usages.

(I have skipped many research areas related to this subject to keep this post clear and simple.)


bradai.fatma said...

Great job, thanks for information and for the way of being clear and simple :)

Unknown said...

Thank you for this very clear and nicely-illustrated explanation of the problem of mapping heterogeneous input to application functionality. Certainly a lot remains to be done in this domain and I'm looking forward to seeing further posts on this or related topics.