These days I have been profiling the Paroli application, as there are some issues with scrolling and loading time. I took the loading time issue first. Right now Enlightenment takes around 7 seconds to fully load and Paroli takes around 20 seconds. As E17 loads Paroli as a startup application before loading its modules, the time the GTA02 takes to show Paroli, counted from the moment the X server starts, is around 30 seconds.
The E17 config we currently have is a default one: we basically load every E17 module even if Paroli does not use it. After tweaking the configuration and making illume's module load asynchronously, E17 now loads in half the time, only 3.5 seconds (the new config is already committed). Sadly, Paroli itself still takes too long.
Note that all the numbers above were measured on an NFS system, but the relative slowness is the same on a flash filesystem.
Tuesday, March 10, 2009
Tuesday, March 3, 2009
Glamo Xrender Benchmark with Expedite
Yesterday I tested the xrender engine on Evas using the current EXA acceleration found in glamo (that is, solid fills and surface blitting). Sadly the test was taking ages to finish, and even after walking away and leaving it running the whole night it didn't finish but hung on the text test.
So I wanted to test just the glue in XRender and its implementation on top of EXA, without painting anything: only the memory moves from system memory to VRAM and the necessary logic in Evas' XRender engine. To do that I "removed" the drawing functions from the xf86-video-glamo driver (they just return TRUE), and here are the results:
| Benchmark | Software X11 | XRender without painting |
| --- | --- | --- |
| Image Blend Unscaled | 2.76 | ???? |
| Image Blend Solid Unscaled | 12.69 | 13.72 |
| Image Blend Nearest Scaled | 1.56 | 18.14 |
| Image Blend Nearest Solid Scaled | 8.77 | 18.00 |
| Image Blend Smooth Scaled | 0.45 | 18.22 |
| Image Blend Smooth Solid Scaled | 5.93 | 17.59 |
| Image Blend Nearest Same Scaled | 5.02 | 21.26 |
| Image Blend Nearest Solid Same Scaled | 22.05 | 17.73 |
| Image Blend Smooth Same Scaled | 1.27 | 20.96 |
| Image Blend Smooth Solid Same Scaled | 11.84 | 17.76 |
| Image Blend Border | 0.51 | 1.83 |
| Image Blend Solid Border | 6.67 | 1.97 |
| Image Blend Border Recolor | 0.44 | 1.23 |
| Image Quality Scale | 4.29 | 1.97 |
| Image Data ARGB | 7.22 | 3.71 |
| Image Data ARGB Alpha | 4.89 | 1.70 |
| Image Data YCbCr 601 Pointer List | 6.54 | 3.16 |
| Image Data YCbCr 601 Pointer List Wide Stride | 6.04 | 5.40 |
| Image Crossfade | 6.67 | 4.61 |
| Text Basic | 9.28 | 2.25 |
| Text Styles | 1.05 | 0.17 |
| Text Styles Different Strings | 0.79 | 0.14 |
| Text Change | 5.64 | 1.86 |
| Textblock Basic | 5.67 | 1.50 |
| Textblock Intl | 4.67 | 2.46 |
| Rect Blend | 1.81 | 9.66 |
| Rect Solid | 9.57 | 18.02 |
| Rect Blend Few | 69.84 | ????? |
| Rect Solid Few | 84.22 | 61.79 |
| Image Blend Occlude 1 Few | 41.09 | 196.75 |
| Image Blend Occlude 2 Few | 24.00 | 47.37 |
| Image Blend Occlude 3 Few | 17.50 | 70.32 |
| Image Blend Occlude 1 | 43.26 | 26.20 |
| Image Blend Occlude 2 | 14.59 | 14.03 |
| Image Blend Occlude 3 | 4.87 | 21.06 |
| Image Blend Occlude 1 Many | 27.31 | 12.14 |
| Image Blend Occlude 2 Many | 6.81 | 4.61 |
| Image Blend Occlude 3 Many | 2.21 | ???? |
| Image Blend Occlude 1 Very Many | 3.79 | 1.54 |
| Image Blend Occlude 2 Very Many | 0.66 | 0.43 |
| Image Blend Occlude 3 Very Many | 0.36 | 0.58 |
| Polygon Blend | 3.51 | 1.69 |
| EVAS SPEED | 11.86 | 18.66 |
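One way to read the table above is to check, test by test, which column scored higher. The following is only a minimal sketch: it assumes expedite's figures are frames per second (so higher is better), it uses just a handful of rows copied from the table, and the helper names `software_wins` and `ratio` are made up for illustration.

```python
# A few rows copied from the table:
# (test name, Software X11, XRender without painting)
ROWS = [
    ("Image Blend Solid Unscaled", 12.69, 13.72),
    ("Image Blend Smooth Scaled", 0.45, 18.22),
    ("Text Basic", 9.28, 2.25),
    ("Textblock Basic", 5.67, 1.50),
    ("Rect Solid", 9.57, 18.02),
    ("EVAS SPEED", 11.86, 18.66),
]


def software_wins(rows):
    """Names of the tests where plain software rendering scored higher."""
    return [name for name, software, xrender in rows if software > xrender]


def ratio(rows, test):
    """XRender-without-painting score divided by the software score."""
    for name, software, xrender in rows:
        if name == test:
            return xrender / software
    raise KeyError(test)


if __name__ == "__main__":
    print("software wins:", software_wins(ROWS))
    print("EVAS SPEED ratio: %.2f" % ratio(ROWS, "EVAS SPEED"))
```

Keep in mind that the XRender column does no painting at all, so any ratio above 1 is an upper bound on what real acceleration could deliver, not a measured gain.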
The results are very disappointing. In several tests, drawing in software is faster than just running the XRender/EXA logic for the same result without drawing anything. And in the tests where XRender/EXA is faster, the speedup is not worth it, as the real drawing will surely be slower. Note that the Glamo chip can only do raster operations into a destination surface of format RGB565, which means that there won't be any acceleration even if the blending is possible in hardware, as Evas uses premultiplied ARGB8888.
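To make the format mismatch concrete, here is a rough sketch of what blending one premultiplied ARGB8888 source pixel onto an RGB565 destination involves. This is not code from Evas or from the glamo driver; the function names are invented for the example and the rounding is the simplest possible one.

```python
def unpack_rgb565(pixel):
    """Split an RGB565 pixel and expand each channel to 8 bits."""
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F
    b5 = pixel & 0x1F
    # Replicate the high bits into the low bits so full scale maps to 255.
    return (r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2)


def pack_rgb565(r, g, b):
    """Pack three 8-bit channels into one RGB565 pixel (drops the low bits)."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def blend_premul_over_rgb565(src_argb, dst_rgb565):
    """Composite a premultiplied ARGB8888 pixel over an RGB565 pixel."""
    a = (src_argb >> 24) & 0xFF
    sr = (src_argb >> 16) & 0xFF
    sg = (src_argb >> 8) & 0xFF
    sb = src_argb & 0xFF
    dr, dg, db = unpack_rgb565(dst_rgb565)
    inv = 255 - a
    # Premultiplied "over": out = src + dst * (1 - alpha), per channel.
    r = min(255, sr + dr * inv // 255)
    g = min(255, sg + dg * inv // 255)
    b = min(255, sb + db * inv // 255)
    return pack_rgb565(r, g, b)


if __name__ == "__main__":
    # Opaque red over green gives pure red in RGB565.
    print(hex(blend_premul_over_rgb565(0xFFFF0000, 0x07E0)))
```

Every pixel needs an unpack, a blend and a repack, and that conversion work sits between the format Evas produces and the only format the chip can write to.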
So how can we improve the rendering speed of Evas specifically for this chip? The path through XRender/EXA is worthless; is there any other way? Well, one possibility is to use Evas' software_16 engine (with a destination surface of format RGB565) to reduce the bandwidth needed, but how do we match that with the XRender API?
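The bandwidth argument is simple arithmetic. A small sketch, assuming a 480x640 panel (the GTA02 screen size, which is not stated in this post) and a hypothetical helper `frame_bytes`:

```python
def frame_bytes(width, height, bits_per_pixel):
    """Bytes needed to move one full frame at the given pixel depth."""
    return width * height * bits_per_pixel // 8


# Assumed panel size; not taken from the post itself.
WIDTH, HEIGHT = 480, 640

rgb565 = frame_bytes(WIDTH, HEIGHT, 16)
argb8888 = frame_bytes(WIDTH, HEIGHT, 32)

if __name__ == "__main__":
    print("RGB565 frame:  ", rgb565, "bytes")
    print("ARGB8888 frame:", argb8888, "bytes")
```

A 16-bit surface halves the bytes moved per full-frame update compared with a 32-bit one, whatever the actual panel size is.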
Another solution could be to abandon the efforts on xf86-video-glamo acceleration and just build a specific Evas engine for glamo: mmap the whole framebuffer memory and manage it through Eina's memory pool manager, handle the surfaces ourselves and do a mix between software_16 and this specific engine. A lot of work, yes, but it looks like the only solution (X aside) that can give us some results. But there's a problem: how do we send the changes to the displayed X window? In our engine we'll use a VRAM backbuffer, and from an X client we can't know the physical memory of the area where the window is being drawn. So we'll have a round trip here, physical memory (our glamo surface) -> virtual memory (Xshm/X memory) -> physical memory (destination framebuffer), which will surely cancel any speedup.
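The "handle the surfaces ourselves" part could look roughly like the sketch below. It is only an illustration of the idea: an anonymous mapping stands in for the mmap'ed Glamo memory, the pool size is arbitrary, the allocator is a plain bump allocator with no freeing, and `SurfacePool` is a made-up class, not Eina's memory pool API.

```python
import mmap


class SurfacePool:
    """Hand out surfaces as offsets into one contiguous mapping."""

    def __init__(self, size_bytes):
        # Anonymous mapping used as a stand-in for the mmap'ed VRAM.
        self.mem = mmap.mmap(-1, size_bytes)
        self.size = size_bytes
        self.next_free = 0

    def alloc_surface(self, width, height, bytes_per_pixel=2):
        """Reserve room for a surface; returns (offset, length in bytes)."""
        need = width * height * bytes_per_pixel
        if self.next_free + need > self.size:
            raise MemoryError("surface pool exhausted")
        offset = self.next_free
        self.next_free += need
        return offset, need

    def fill(self, offset, length, rgb565):
        """Solid-fill a 16-bit surface with one RGB565 colour."""
        pixel = rgb565.to_bytes(2, "little")
        self.mem[offset:offset + length] = pixel * (length // 2)


if __name__ == "__main__":
    pool = SurfacePool(1 << 20)             # arbitrary 1 MiB pool
    back, size = pool.alloc_surface(480, 640)
    pool.fill(back, size, 0xF800)           # red backbuffer
    print("backbuffer at offset", back, "length", size)
```

Allocating is the easy part. Getting the backbuffer contents into the window that X actually displays is where the round trip described above comes in.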
Suggestions?
Sunday, March 1, 2009
Initial Benchmark of the xf86-video-glamo on GTA02
After the tremendous problems I had setting up my build environment, I have finally succeeded :)
So I had the chance to give xf86-video-glamo a try and see how well it behaves. For the benchmark I used expedite, and the results are:
2.76 , Image Blend Unscaled
12.69 , Image Blend Solid Unscaled
1.56 , Image Blend Nearest Scaled
8.77 , Image Blend Nearest Solid Scaled
0.45 , Image Blend Smooth Scaled
5.93 , Image Blend Smooth Solid Scaled
5.02 , Image Blend Nearest Same Scaled
22.05 , Image Blend Nearest Solid Same Scaled
1.27 , Image Blend Smooth Same Scaled
11.84 , Image Blend Smooth Solid Same Scaled
0.51 , Image Blend Border
6.67 , Image Blend Solid Border
0.44 , Image Blend Border Recolor
4.29 , Image Quality Scale
7.22 , Image Data ARGB
4.89 , Image Data ARGB Alpha
6.54 , Image Data YCbCr 601 Pointer List
6.04 , Image Data YCbCr 601 Pointer List Wide Stride
6.67 , Image Crossfade
9.28 , Text Basic
1.05 , Text Styles
0.79 , Text Styles Different Strings
5.64 , Text Change
5.67 , Textblock Basic
4.67 , Textblock Intl
1.81 , Rect Blend
9.57 , Rect Solid
69.84 , Rect Blend Few
84.22 , Rect Solid Few
41.09 , Image Blend Occlude 1 Few
24.00 , Image Blend Occlude 2 Few
17.50 , Image Blend Occlude 3 Few
43.26 , Image Blend Occlude 1
14.59 , Image Blend Occlude 2
4.87 , Image Blend Occlude 3
27.31 , Image Blend Occlude 1 Many
6.81 , Image Blend Occlude 2 Many
2.21 , Image Blend Occlude 3 Many
3.79 , Image Blend Occlude 1 Very Many
0.66 , Image Blend Occlude 2 Very Many
0.36 , Image Blend Occlude 3 Very Many
3.51 , Polygon Blend
11.86 , EVAS SPEED
The benchmark took around half an hour to finish and, judging from the final EVAS SPEED value, it is really, really slow.
Note that right now the driver is just a wrapper on top of fbdev, so no acceleration is coded yet, only software-based rendering, and given that the CPU isn't that fast either, the benchmark result is no surprise.
XRender acceleration is one of the possibilities to improve the performance, and the good news is that Evas already provides an XRender-based engine. So let's get our hands dirty and start hacking on the driver! :)