GUI

GUI layers and features in traditional Windows, Icons, Menus, Pointer (WIMP) interfaces

Font, characters, text and spacing:
- Subpixel hinting
- Colored fonts (used e.g. for colored emojis)
- Fonts with ligatures
- kerning
- line-spacing (a lot of fonts have a variable height)
- A font can have parts of the character outside of the box used to position the characters, e.g. the arm of a capital L may continue under the following letters in Loki Cola
- Font formats:
- bitmap vs. vector.
- Format names: otf, metafont, ttf, …
- Available features differ in formats, e.g. color, kinds of curves (degree of bézier curves)
- Variable-width fonts
- Flexible spacing (glue in TeX)
- Hyphenation
- Knuth-Plass line-breaking algorithm for text reflow, which also chooses among the possible hyphenation points (used in TeX)
- Font size, dpi and zoom level (font size is not just a zoom because the font weight should vary slightly based on the on-screen size)
- On-screen size of a text is hard to determine in advance, i.e. a call to size(text, zoomlevel * fontsize) is needed.
- In order to avoid reflow when zooming, the font and text layout must avoid changing the width of a string when the zoom factor changes.
- HiDPI displays
- Decomposition of characters:
- Character = an individual entry in the Unicode list of characters, e.g. "e"
- Code point = the number / identifier assigned to a character by the Unicode standard, e.g. U+0065 for "e"
- Grapheme = sequence of code points that are combined together to visually form a single entity, e.g. U+0065 (latin small letter E) combines with U+0301 (combining acute accent) to give the grapheme "é" (this grapheme also has a legacy encoding as a single code point, U+00E9, but not all combinations have an equivalent code point). source: https://stackoverflow.com/a/27331885
  - Unicode combining diacritics and joiners are used for this (used for accented characters and emojis)
- Glyph = an image, stored in a font, for a given character
- Subscript, superscript, sub-subscripts, symbols like \sum with scripts below and above the symbol
- Left-To-Right and Right-To-Left, boustrophedon (alternating LTR and RTL), Vertical and horizontal directions.
- These also affect the way selection and cursor navigation works (the current status in some widely-used GUIs is very bad when RTL and LTR are mixed).
- Ambiguity of displayed Unicode characters
- "Inline string" data type should contain a mix of inline elements.
- The gist is that:
  - an "inline string" is a sequence of character-like entities
  - it is displayed with carriage returns only when the end of the line is reached
  - it should be able to read an "inline string" aloud, where the few graphical elements are described when reading
  - if it would make sense to have a ligature between two characters, they should be in the same inline string when sent to display (the opposite is not true, two characters can be in the same inline string but displayed without a ligature e.g. because they are written in different languages)
- characters in Unicode or other encodings
- emojis and mathematical symbols that do not have a codepoint yet
- spark graphs (small inline line graph of the variation of some quantity)
- small inline directed or undirected graphs
- other small pictures
- anchors (to mark special points in the text e.g. to mark the destination of an arrow)
- subelements (to mark the start and end of hyperlinks, bold, highlight, annotations, tooltips, languages and so on).
- An "inline string" is laid out as a flow (see Layout below)
- An "inline string" can be processed as a stream of character-like entities + delimiters by doing an inorder traversal of the tree of subelements.
- JetBrains MPS and GNU Emacs use something similar to describe GUIs to display ASTs
- There should be a way to display the text cursor (e.g. vertical bar) without moving the rest of the text
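
The stream view of an "inline string" described above can be sketched as follows. This is a minimal Python illustration, not a reference design: the class names (`Text`, `Picture`, `Span`) and the `("start", kind)` / `("end", kind)` delimiter tuples are hypothetical and do not come from any existing toolkit.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple, Union


@dataclass
class Text:
    """A run of ordinary characters."""
    chars: str


@dataclass
class Picture:
    """A non-character inline element (spark graph, small picture, ...)."""
    description: str  # what a screen reader could say instead


@dataclass
class Span:
    """A subelement (hyperlink, bold, language, ...) wrapping children."""
    kind: str
    children: List["Node"] = field(default_factory=list)


Node = Union[Text, Picture, Span]


def stream(node: Node) -> Iterator[Tuple[str, str]]:
    """Walk the tree in document order, yielding entities and delimiters."""
    if isinstance(node, Text):
        for ch in node.chars:
            yield ("char", ch)
    elif isinstance(node, Picture):
        yield ("picture", node.description)
    else:
        yield ("start", node.kind)
        for child in node.children:
            yield from stream(child)
        yield ("end", node.kind)


doc = Span("root", [
    Text("Hi "),
    Span("bold", [Text("x")]),
    Picture("spark graph of CPU load"),
])
events = list(stream(doc))
```

A consumer such as a layout engine or a screen reader can then process `events` sequentially, using the start/end delimiters to track which subelements are currently open.
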
Icons
- vectorial
- low-resolution vectorial (Haiku OS uses a special vectorial image format for low-resolution icons).
- bitmap
- support for alternative icon themes
- composition of icons (emblems on top of file icons to indicate important, read-only, shortcut etc.)
Selection
- The Humane Interface suggests what seems like the best approach: a cursor is both an insertion point marked with a vertical bar with a small curve towards the next character (l-shaped bar) and a selection marked with a grey background.
- When inserting a character, e.g. "a", it is inserted on the side of the vertical bar indicated by its small curve.
- When deleting or copying, the selection (grey part) is used.
- This way it is clear what both kinds of operations do.
- In The Humane Interface there is a chapter on selection, with plenty of good ideas, e.g. clicking in the middle of a character selects that character and puts the insertion cursor to the right of that character. That way, the user does not need to aim at the small space between two characters. Most GUIs put the cursor on the left (or right) of the character for clicks in its left (respectively right) half, but this rule is difficult to learn: the area of half a character is so small that it is not evident that the effect is linked to the position of the click within the character.
- Multiple selections, multiple insertion cursors
- Click: query a quadtree, kd-tree or object ID bitmap to know in which text field and on which character the click occurred.
- Click+drag: query the widget and character for the click and for each subsequent mouse position, and highlight the elements in between. Needs an order on the elements. This order is the natural one for sequences of inline widgets. When the mouse cursor is not on a selectable object (outside of the region containing the sequence of inline elements, or over a non-selectable part of that region) it is necessary to find the closest element that is contiguously selectable with respect to the initial click. In general this cannot be done with an object bitmap (as the answer depends on where the first click occurred), and a more clever data structure is needed.
- On X.org, selecting text copies it to the selection clipboard. An explicit keyboard shortcut or context menu action copies to the separate Ctrl+C Ctrl+V clipboard. Some applications or GUI toolkits accidentally overwrite the selection clipboard when focus changes; care should be taken to set the selection clipboard only after an explicit selection with the mouse (drag) or keyboard (Shift+motion keys).
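
The two click rules discussed above (whole-character selection as suggested in The Humane Interface, versus the usual half-character rule) can be compared with a small Python sketch. It assumes a single line of text described only by per-character advance widths; the function names are hypothetical and the horizontal-only model ignores kerning, ligatures and bidirectional text.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Optional, Tuple


def char_at(x: float, widths: List[float]) -> Optional[int]:
    """Index of the character whose horizontal extent contains x, or None."""
    edges = list(accumulate(widths))  # right edge of each character
    if x < 0 or not edges or x >= edges[-1]:
        return None
    return bisect_right(edges, x)


def humane_click(x: float, widths: List[float]) -> Optional[Tuple[int, int]]:
    """Whole-character rule: select the clicked character and put the
    insertion point to its right. Returns (selected_index, insertion_point)."""
    i = char_at(x, widths)
    return None if i is None else (i, i + 1)


def conventional_click(x: float, widths: List[float]) -> Optional[int]:
    """Half-character rule: the insertion point goes to the nearest boundary."""
    i = char_at(x, widths)
    if i is None:
        return None
    left = sum(widths[:i])  # left edge of character i
    return i if x - left < widths[i] / 2 else i + 1
```

With widths `[10, 4, 10]`, a click anywhere inside the narrow middle character gives the same result under the whole-character rule, whereas the half-character rule returns two different insertion points depending on a 2-unit difference in the click position.
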
Human input devices
- Pointer input
- Mouse is relative
  - Mouse motion is usually integrated to obtain an absolute position on-screen.
  - In games mouse motion is used as a relative measure to indicate rotation of the camera.
- Touchscreen, tablet and eye tracking are absolute
- Keyboard input
- input method (XIM, UIM, SCIM, IBus …)
- Android keyboard and on-screen keyboards
- n-key rollover
  - capacity to detect many keys pressed at the same time.
  - many keyboards handle modifier keys specially; this means that modifiers cannot be remapped elsewhere on the keyboard, otherwise some key combinations like Shift+Ctrl+C might not be detected correctly.
  - The Fn modifier key on laptops is often handled directly by the firmware, and does not send an actual keycode. This prevents using that key as a non-modifier key or as a modifier in combinations which are not explicitly supported by the firmware. Some laptops (e.g. some ThinkPad laptops) send an actual scancode for that key.
- multimedia keys (volume down, calculator, suspend computer, …)
- scancodes (sent directly by keyboard for down/up events)
- keyboard layout: mapping from scancodes to keysyms (default layout, switch layout with a key combination or GUI widget, customization of the layout with e.g. Xmodmap)
- keyboard macros (often implemented in hardware because of the lack of good software support)
  - XCompose is a sort of keyboard macro mechanism: the user inputs a sequence of keys, which are translated to one (or several) Unicode symbol(s)
- application has access to scancodes, mapped keysyms and the output of XCompose macros. Some GUI toolkits do not provide access to all information.
- modifier keys can act as regular keys (e.g. super is both a modifier and a key that opens the application menu)
- multiple input devices connected to the same machine
- keyboard
- separate numeric pad
- pedals
- power management buttons (on/off/sleep button)
- USB devices can act as keyboards (e.g. iPod with RockBox simulates volume and previous/next keys when the buttons on the device are pressed).
- multiseat: multiple input and output devices used by several users at the same time. Input devices are dispatched to the session of the corresponding user.
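
The compose mechanism mentioned above (a sequence of keys translated to one or several Unicode symbols) can be sketched in Python as a lookup over key sequences. The table below is a small illustrative sample written with X keysym names, and the abort behaviour (an invalid continuation silently drops the whole pending sequence) is a simplifying assumption, not a description of what any particular input method does.

```python
from typing import Dict, List, Tuple

# Illustrative compose table: key sequence -> output text.
COMPOSE: Dict[Tuple[str, ...], str] = {
    ("Multi_key", "apostrophe", "e"): "é",
    ("Multi_key", "o", "c"): "©",
    ("Multi_key", "minus", "greater"): "→",
}


def is_prefix(seq: Tuple[str, ...]) -> bool:
    """True if seq is the beginning of at least one compose sequence."""
    return any(key[:len(seq)] == seq for key in COMPOSE)


def translate(keys: List[str], plain: Dict[str, str]) -> str:
    """Turn a list of keysyms into text, applying compose sequences.

    `plain` maps ordinary keysyms to the text they produce on their own."""
    out: List[str] = []
    pending: Tuple[str, ...] = ()
    for key in keys:
        candidate = pending + (key,)
        if candidate in COMPOSE:      # sequence complete
            out.append(COMPOSE[candidate])
            pending = ()
        elif is_prefix(candidate):    # sequence still in progress
            pending = candidate
        elif pending:                 # invalid continuation: drop the sequence
            pending = ()
        else:                         # ordinary key outside any sequence
            out.append(plain.get(key, ""))
    return "".join(out)
```

For example, `translate(["c", "Multi_key", "apostrophe", "e"], {"c": "c", "e": "e"})` produces the letter c followed by an accented e.
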
Focus:
- next / previous at several levels (left/right vs. alt+tab are extremes)
- in/out for zoomable and hierarchical interfaces.
- notify system when the focus changes (for accessibility etc.)
- receive from the system the order to place the focus somewhere (including inside a menu), for "show me" in help files, tutorials etc.
Semantic structure of the interface
- The semantic structure may not match the layout structure (e.g. a sequence of buttons may be displayed as several separate horizontal containers).
- Tab order (order in which the elements are cycled through via the Tab key)
- It should be possible to read all elements using a screen reader
- It should be possible to place the focus on all actionable elements, including a way to copy read-only text that does not normally receive focus.
- To this effect the interface can be linearized, or better, structured in a hierarchical form (to avoid the need to focus many elements before the desired one).
Report position of widgets
- to help black-box GUI testing
- so that help files and tutorials can show where an element is
- so that accessibility options can highlight the element under focus
- needed internally to display tooltips
Layout:
- Flow
- inline (like <span> in HTML)
- other (relative positioning of blocks)
- connected text zones (text can start filling one zone and then continue to spill in the next one)
- table-like alignment (comments aligned on the right, similar function calls formatted like a table, labels and input fields, …)
- other alignment of widgets: macOS uses the Cassowary constraint-solving algorithm (in Auto Layout) to solve alignment constraints.
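
As an illustration of the inline flow case above, here is a minimal greedy line-filling sketch in Python. It assumes each inline element is an unbreakable box with a known width; it is neither the paragraph-level optimisation used in TeX nor a constraint solver, and the function name is hypothetical.

```python
from typing import List


def flow(widths: List[float], line_width: float) -> List[List[int]]:
    """Greedy flow: place each inline box on the current line if it fits,
    otherwise start a new line. Returns the box indices on each line."""
    lines: List[List[int]] = [[]]
    used = 0.0
    for i, w in enumerate(widths):
        # A box wider than the line still gets a line of its own.
        if lines[-1] and used + w > line_width:
            lines.append([])
            used = 0.0
        lines[-1].append(i)
        used += w
    return lines
```

Because the result depends only on the box widths and the available width, the same function can be re-run whenever the container is resized or the zoom level changes.
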
Display and update overlapping widgets
- double-buffering
- compositing (the compositor takes the individual buffers of the windows, and displays them with occlusion, alpha blending etc.), see https://forum.osdev.org/viewtopic.php?p=226482#p226482 for a short overview
- dirty rectangles, see http://web.archive.org/web/20180630144602/http://www.trackze.ro/page/2 (window manager tutorial)
- In zoomable user interfaces, overlap is less of a problem; instead it is necessary to draw things with varying levels of detail (see LOD below) and at a high framerate
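
The dirty-rectangle idea can be sketched in Python as follows: when a region of the screen becomes invalid, only the windows intersecting it are asked to redraw, and only inside the intersection. The `Rect` type and `repaint_plan` function are hypothetical names for this sketch, which assumes axis-aligned integer rectangles and a back-to-front (painter's algorithm) redraw order.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def intersect(self, other: "Rect") -> Optional["Rect"]:
        """Overlapping area of two rectangles, or None if they do not overlap."""
        x0 = max(self.x, other.x)
        y0 = max(self.y, other.y)
        x1 = min(self.x + self.w, other.x + other.w)
        y1 = min(self.y + self.h, other.y + other.h)
        if x1 <= x0 or y1 <= y0:
            return None
        return Rect(x0, y0, x1 - x0, y1 - y0)


def repaint_plan(windows_back_to_front: List[Tuple[str, Rect]],
                 dirty: Rect) -> List[Tuple[str, Rect]]:
    """For each window touching the dirty rectangle, the clip region it must
    redraw, listed back to front so later windows paint over earlier ones."""
    plan = []
    for name, rect in windows_back_to_front:
        clip = rect.intersect(dirty)
        if clip is not None:
            plan.append((name, clip))
    return plan
```

Windows that do not touch the dirty rectangle are left out of the plan entirely, which is where the savings over a full-screen redraw come from.
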
Network transparency
- In X.org, applications send streams of commands (including OpenGL commands, see notes on compositing) to the display server. This model allows the program and the server to be on different machines.
- Other systems require detection of updated parts of the screen (dirty rectangles) and sending the pixels in those portions of the image to the display machine.
- HTML pushes the toolkit on the client-side, while the server-side may perform the computation.
UI construction and update paradigm
- Reactive
- Functional Reactive Programming
- Object Oriented
- …
Accessibility
- Features
- color inversion (can be done by compositing)
- constrained color palettes, e.g. high contrast or inverse high contrast (done by toolkit (theme) or compositing)
- edge detection on images
- alternative icons: simpler symbolic icons, text instead of icons (done by toolkit)
- zoom (done by compositing usually for high amounts of zoom, but increasing font size for the document and increasing the size of toolbars, menus and other widgets should be handled by the application, to support HiDPI screens)
- screen readers
  - voice fonts
  - explicit mention of significant parentheses, submenus, links and other visual clues when they are necessary to a correct understanding of the interface.
  - linearized or hierarchical interface
- dim less relevant parts of the screen (e.g. background windows) for ADD
- braille terminals
- keyboard-only navigation for everything except freehand drawing
- mouse-only navigation for everything except text input (the system usually provides a virtual keyboard)
- EEG-only or eye-tracking-only or voice-input-only navigation for paralyzed people and hands-free usage
- stdin/stdout or API so that the program may be controlled by another program or via an SSH session.
- Frameworks / APIs
- WAI-ARIA on the web
- Orca for GNOME
- If the GUI toolkit is well designed (semantic description instead of ad-hoc mutation of widget properties and non-semantic drawing on canvases), then accessibility features should be easy to implement.
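
As a small illustration of a feature that can be done at the compositing stage, independently of the applications, the following Python sketch inverts the colors of a frame. The frame representation (a list of rows of `(r, g, b, a)` tuples with 8-bit channels) is an assumption made for this example; a real compositor would typically do this in a shader.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (red, green, blue, alpha), 0..255 each


def invert(frame: List[List[Pixel]]) -> List[List[Pixel]]:
    """Return a copy of the frame with inverted colors; alpha is unchanged."""
    return [
        [(255 - r, 255 - g, 255 - b, a) for (r, g, b, a) in row]
        for row in frame
    ]
```
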
From game dev
- Level of Detail (LOD)
- Z-buffer: during compositing, to detect whether a pixel should be updated or if it is hidden by another GUI element
- Object ID buffer or quadtree / kd-tree (to assign clicks)
- Raytracing
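
The object ID buffer mentioned above can be sketched in Python: widgets are painted back to front into an off-screen grid that stores a widget ID per pixel instead of a color, and a click is resolved with a single lookup. The function names, the tuple layout and the use of ID 0 for "no widget" are assumptions for this sketch.

```python
from typing import List, Tuple

Widget = Tuple[int, int, int, int, int]  # (widget_id, x, y, width, height)


def build_id_buffer(width: int, height: int,
                    widgets: List[Widget]) -> List[List[int]]:
    """Paint widgets in back-to-front order; each cell ends up holding the ID
    of the topmost widget covering it (0 means background)."""
    buf = [[0] * width for _ in range(height)]
    for wid, x, y, w, h in widgets:
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                buf[row][col] = wid
    return buf


def widget_at(buf: List[List[int]], x: int, y: int) -> int:
    """ID of the widget under the click, or 0 if none or out of bounds."""
    if 0 <= y < len(buf) and 0 <= x < len(buf[0]):
        return buf[y][x]
    return 0
```

The lookup is constant-time, at the cost of rebuilding (part of) the buffer whenever widgets move. As noted in the Selection section, this structure alone cannot answer queries that depend on where a drag started.
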
List of common shortcuts
- Not all are wise choices, and some are overloaded, but differences from the norm should have a reason
- Ctrl+PgUp previous (left) tab (onglet)
- Ctrl+PgDown next (right) tab (onglet)
- Tab or Ctrl+Space autocomplete (various choices for whether to apply autocomplete on Enter, Tab, Right or other)
- Tab next field
- Shift+Tab previous field
- Shift+Arrow move cursor and add to selection at the same time (usually resets the selection if the previous command was not a Shift+Motion key)
- Ctrl+Left, Ctrl+Right move cursor by word
- Ctrl+Up, Ctrl+Down move cursor by paragraph
- Shift+Ctrl+Arrow same as Ctrl+Arrow but also add to selection everything between the previous and new cursor positions. Usually resets the selection if the previous command was not a Shift+Motion key
- Shift+Click add to selection (all elements between last one selected and the one just clicked).
- If already selected, usually remove.
- If the range contains a mix of already selected and unselected elements, unsure (probably add all, could be toggle)
- Ctrl+Click add to selection (just this element) / remove if already added
- Shift+Click+Drag: add to / remove from selection (may also add all the elements between the last one selected and the one clicked)
- Ctrl+Click+Drag: add to / remove from selection
- Click: set focus and position text cursor
- Click+Drag: select and position text cursor
- F1 Help
- F2 Rename
- Ctrl+C copy
- Ctrl+X cut
- Ctrl+V paste
- Ctrl+S save
- Ctrl+W close document (in the current tab (onglet) or window)
- Ctrl+F4 close document (in the current tab)
- Alt+F1 open start menu, handled by the window manager
- Alt+F2 execute (usually handled by the window manager)
- Alt+F4 close window (all documents in current window), handled by the window manager
- F5 Run program (IDEs) or refresh page (browsers)
- Ctrl+A select all
- Ctrl+Z undo
- Ctrl+T open new file (possibly in a tab (onglet))
- Ctrl+T open new tab (onglet)
- Ctrl+Tab, Ctrl+Shift+Tab next/previous (sub-)window within the application or next/previous tab (onglet), sometimes handled by the window manager to cycle between windows, can use a stack (Alt+Tab, release, Alt+Tab is a no-op) or ring model (usually annoying to use)
- Alt+Tab, handled by the window manager, can use a stack (better) or ring (annoying) model
- Ctrl+I Ctrl+U Ctrl+B Italics, Underline, Bold
- Alt+left-click+drag on X.org move window, often handled by window manager
- Alt+middle-click+drag or Alt+right-click+drag on X.org resize window, often handled by window manager, the quadrant or octant clicked indicates which corner or edge of the window should be moved
- Enter validate
- Escape cancel, discard input and close, for dialog boxes and prompts
- RightClick context menu
- Long press on touch screen: context menu or select or move
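
The stack model for Alt+Tab mentioned above can be sketched in Python as a most-recently-used list. The class and method names are hypothetical, and the sketch assumes at least one window exists.

```python
from typing import List


class MruSwitcher:
    """Stack model: windows are kept in most-recently-used order."""

    def __init__(self, windows: List[str]) -> None:
        self.order = list(windows)  # index 0 is the currently focused window
        self.index = 0              # window highlighted while Alt is held

    def press_tab(self) -> str:
        """Alt is held and Tab is pressed: highlight the next window."""
        self.index = (self.index + 1) % len(self.order)
        return self.order[self.index]

    def release_alt(self) -> str:
        """Alt is released: focus the highlighted window and move it to the
        front of the list."""
        chosen = self.order.pop(self.index)
        self.order.insert(0, chosen)
        self.index = 0
        return chosen
```

With windows `["editor", "browser", "terminal"]`, doing Alt+Tab, release, Alt+Tab, release goes to the browser and then back to the editor, leaving the order as it started. In a ring model the second Alt+Tab would move on to the terminal instead.
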
List of common GUI features, widgets and conventions
- Tool tips
- Status bar
- Drag and drop
- Standard widgets with a well-understood meaning: buttons, checkboxes, radio buttons, toggle buttons (usually poor UI), separators etc.
- Scroll:
- scroll bar
- smooth scroll
- infinite scroll
- zoom on scroll bar (e.g. for video editing or playing, to select a precise second in a 2-hour progress bar)
- Toolbars, menu bars, "hamburger" menus
- _Access key (in a menu entry or for a button: a key which will trigger an entry is indicated by underlining that character somewhere in the entry's name)
- Shortcuts on the right of menus
- Blue background indicates the currently focused selection.
- Other selections are grey.
- A black text cursor indicates the currently focused cursor.
- Unfocused text cursors may be grey or invisible.
- Dotted line on the edge of a button or hyperlink indicates it is focused.
- Focused tab is highlighted (no line separating it from the contents, blue background, …)
- bold label indicates default button (often OK)
- Feedback for actions:
- when clicked a button gets depressed
- when clicked a link changes color
- action in progress indicated by mouse cursor change, progress bar or spinner
Integration with the rest of the system
- File browser integration, e.g. TortoiseSVN and TortoiseGit
- Progress bar in application icon (shown in task bar and Alt+Tab)
- Updatable application icon (shown in task bar and Alt+Tab)
- Notifications on application icon, e.g. the number of unread messages in a colored bubble, or an exclamation mark (shown in task bar and Alt+Tab)
- Notification bubbles with buttons
- Updatable systray icon with menu, arbitrary systray widget
- Detect loss of focus on the window, detect window minimization
- Detect window resize
- Custom title bar (Firefox and Chromium merge tabs into the title bar)
- Borderless, fullscreen
- Clipboard: two clipboards in X.org, the usual Ctrl+C Ctrl+V clipboard may contain text, files, bitmap pictures and other application-specific data. Often this data can be converted to a string as a fallback.
GUI features that are useful or common on a given platform
- Customizable toolbars: Right click to add/remove buttons, create new toolbars, submenus and so on (Microsoft Word, LibreOffice, Apache OpenOffice)
- Machine-navigatable menus:
- Help files can open menus and show features (Old Windows apps)
- Allows for accessibility options (Orca on GNOME)
- macOS: the system provides a search box to find an entry in the (sub-)menus and configuration dialogs of the application.
- Automated GUI actions (Desktop Communication Protocol (DCOP) on K Desktop Environment (KDE), Desktop Bus (D-Bus)?, …)
TODO: add reference to ergonomics survey
TODO: capabilities based on the connection (e.g. remote users cannot shut down the machine)

In this folder

Layout engine

Zoomable user interface