#38 - Implement VT-d large pages so we can avoid sharing between EPT

Owner: George Dunlap <George.Dunlap@eu.citrix.com>

Date: Mon Feb 10 16:30:02 2014

Last Update: Thu Feb 13 16:45:02 2014

Severity: normal

Affects:

State: Open

[ Retrieve as mbox ]


From: Yang Zhang <yang.z.zhang@intel.com>
To: xen-devel@lists.xen.org
Cc: andrew.cooper3@citrix.com, tim@xen.org, Yang Zhang <yang.z.zhang@Intel.com>, xiantao.zhang@intel.com, JBeulich@suse.com
Subject: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 10 Feb 2014 14:14:00 +0800
Message-ID: <1392012840-22555-1-git-send-email-yang.z.zhang@intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

From: Yang Zhang <yang.z.zhang@Intel.com>

When enabling log dirty mode, it sets all guest's memory to readonly.
And in HAP enabled domain, it modifies all EPT entries to clear write bit
to make sure it is readonly. This will cause problem if VT-d shares page
table with EPT: the device may issue a DMA write request, then VT-d engine
tells it the target memory is readonly and result in VT-d fault.

Currnetly, there are two places will enable log dirty mode: migration and vram
tracking. Migration with device assigned is not allowed, so it is ok. For vram,
it doesn't need to set all memory to readonly. Only track the vram range is enough.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
---
 xen/arch/x86/mm/hap/hap.c       |   20 ++++++++++++++------
 xen/arch/x86/mm/paging.c        |    9 +++++----
 xen/arch/x86/mm/shadow/common.c |    2 +-
 xen/include/asm-x86/domain.h    |    2 +-
 xen/include/asm-x86/paging.h    |    5 +++--
 xen/include/asm-x86/shadow.h    |    2 +-
 6 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index d3f64bd..5f75636 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -82,7 +82,7 @@ int hap_track_dirty_vram(struct domain *d,
         if ( !paging_mode_log_dirty(d) )
         {
             hap_logdirty_init(d);
-            rc = paging_log_dirty_enable(d);
+            rc = paging_log_dirty_enable(d, 0);
             if ( rc )
                 goto out;
         }
@@ -167,17 +167,25 @@ out:
 /*            HAP LOG DIRTY SUPPORT             */
 /************************************************/
 
-/* hap code to call when log_dirty is enable. return 0 if no problem found. */
-static int hap_enable_log_dirty(struct domain *d)
+/*
+ * hap code to call when log_dirty is enable. return 0 if no problem found.
+ *
+ * NB: Domain that having device assigned should not set log_global. Because
+ * there is no way to track the memory updating from device.
+ */
+static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
 {
     /* turn on PG_log_dirty bit in paging mode */
     paging_lock(d);
     d->arch.paging.mode |= PG_log_dirty;
     paging_unlock(d);
 
-    /* set l1e entries of P2M table to be read-only. */
-    p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
-    flush_tlb_mask(d->domain_dirty_cpumask);
+    if ( log_global )
+    {
+        /* set l1e entries of P2M table to be read-only. */
+        p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
+        flush_tlb_mask(d->domain_dirty_cpumask);
+    }
     return 0;
 }
 
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 21344e5..ab5eacb 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -164,7 +164,7 @@ void paging_free_log_dirty_bitmap(struct domain *d)
     paging_unlock(d);
 }
 
-int paging_log_dirty_enable(struct domain *d)
+int paging_log_dirty_enable(struct domain *d, bool_t log_global)
 {
     int ret;
 
@@ -172,7 +172,7 @@ int paging_log_dirty_enable(struct domain *d)
         return -EINVAL;
 
     domain_pause(d);
-    ret = d->arch.paging.log_dirty.enable_log_dirty(d);
+    ret = d->arch.paging.log_dirty.enable_log_dirty(d, log_global);
     domain_unpause(d);
 
     return ret;
@@ -489,7 +489,8 @@ void paging_log_dirty_range(struct domain *d,
  * These function pointers must not be followed with the log-dirty lock held.
  */
 void paging_log_dirty_init(struct domain *d,
-                           int    (*enable_log_dirty)(struct domain *d),
+                           int    (*enable_log_dirty)(struct domain *d,
+                                                      bool_t log_global),
                            int    (*disable_log_dirty)(struct domain *d),
                            void   (*clean_dirty_bitmap)(struct domain *d))
 {
@@ -590,7 +591,7 @@ int paging_domctl(struct domain *d, xen_domctl_shadow_op_t *sc,
     case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY:
         if ( hap_enabled(d) )
             hap_logdirty_init(d);
-        return paging_log_dirty_enable(d);
+        return paging_log_dirty_enable(d, 1);
 
     case XEN_DOMCTL_SHADOW_OP_OFF:
         if ( paging_mode_log_dirty(d) )
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 0bfa595..11c6b62 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -3418,7 +3418,7 @@ shadow_write_p2m_entry(struct vcpu *v, unsigned long gfn,
 /* Shadow specific code which is called in paging_log_dirty_enable().
  * Return 0 if no problem found.
  */
-int shadow_enable_log_dirty(struct domain *d)
+int shadow_enable_log_dirty(struct domain *d, bool_t log_global)
 {
     int ret;
 
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index ea72db2..4ff89f0 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -169,7 +169,7 @@ struct log_dirty_domain {
     unsigned int   dirty_count;
 
     /* functions which are paging mode specific */
-    int            (*enable_log_dirty   )(struct domain *d);
+    int            (*enable_log_dirty   )(struct domain *d, bool_t log_global);
     int            (*disable_log_dirty  )(struct domain *d);
     void           (*clean_dirty_bitmap )(struct domain *d);
 };
diff --git a/xen/include/asm-x86/paging.h b/xen/include/asm-x86/paging.h
index cd7ee3b..8dd2a61 100644
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -143,14 +143,15 @@ void paging_log_dirty_range(struct domain *d,
                             uint8_t *dirty_bitmap);
 
 /* enable log dirty */
-int paging_log_dirty_enable(struct domain *d);
+int paging_log_dirty_enable(struct domain *d, bool_t log_global);
 
 /* disable log dirty */
 int paging_log_dirty_disable(struct domain *d);
 
 /* log dirty initialization */
 void paging_log_dirty_init(struct domain *d,
-                           int  (*enable_log_dirty)(struct domain *d),
+                           int  (*enable_log_dirty)(struct domain *d,
+                                                    bool_t log_global),
                            int  (*disable_log_dirty)(struct domain *d),
                            void (*clean_dirty_bitmap)(struct domain *d));
 
diff --git a/xen/include/asm-x86/shadow.h b/xen/include/asm-x86/shadow.h
index 852023d..348915e 100644
--- a/xen/include/asm-x86/shadow.h
+++ b/xen/include/asm-x86/shadow.h
@@ -82,7 +82,7 @@ void shadow_teardown(struct domain *d);
 void shadow_final_teardown(struct domain *d);
 
 /* shadow code to call when log dirty is enabled */
-int shadow_enable_log_dirty(struct domain *d);
+int shadow_enable_log_dirty(struct domain *d, bool_t log_global);
 
 /* shadow code to call when log dirty is disabled */
 int shadow_disable_log_dirty(struct domain *d);
-- 
1.7.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: Tim Deegan <tim@xen.org>
To: Yang Zhang <yang.z.zhang@intel.com>
Cc: andrew.cooper3@citrix.com, JBeulich@suse.com, xiantao.zhang@intel.com, xen-devel@lists.xen.org
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 10 Feb 2014 09:03:14 +0100
Message-ID: <20140210080314.GA758@deinos.phlegethon.org>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote:
> From: Yang Zhang <yang.z.zhang@Intel.com>
> 
> When enabling log dirty mode, it sets all guest's memory to readonly.
> And in HAP enabled domain, it modifies all EPT entries to clear write bit
> to make sure it is readonly. This will cause problem if VT-d shares page
> table with EPT: the device may issue a DMA write request, then VT-d engine
> tells it the target memory is readonly and result in VT-d fault.

So that's a problem even if only the VGA framebuffer is being tracked
-- DMA from a passthrough device will either cause a spurious error or
fail to update the dirt bitmap. 

I think it would be better not to allow VT-d and EPT to share
pagetables in cases where devices are passed through (i.e. all cases
where VT-d is in use).

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Tim Deegan <tim@xen.org>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "JBeulich@suse.com" <JBeulich@suse.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 10 Feb 2014 08:15:16 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9D72C4@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

Tim Deegan wrote on 2014-02-10:
> At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote:
>> From: Yang Zhang <yang.z.zhang@Intel.com>
>> 
>> When enabling log dirty mode, it sets all guest's memory to readonly.
>> And in HAP enabled domain, it modifies all EPT entries to clear
>> write bit to make sure it is readonly. This will cause problem if
>> VT-d shares page table with EPT: the device may issue a DMA write
>> request, then VT-d engine tells it the target memory is readonly and
>> result in VT-d
> fault.
> 
> So that's a problem even if only the VGA framebuffer is being tracked
> -- DMA from a passthrough device will either cause a spurious error or
> fail to update the dirt bitmap.

Do you mean the VGA frambuffer will be used as DMA buffer in guest? If yes, I think it's guest's responsibility to ensure it never happens.

> 
> I think it would be better not to allow VT-d and EPT to share
> pagetables in cases where devices are passed through (i.e. all cases where VT-d is in use).

Without VT-d and EPT share page, we still cannot track the memory updating from DMA. I think the point is that we cannot track the memory updating via DMA. So the user should use the log dirty mode carefully. Also, I am not sure whether the memory updating from dom0 and QEMU is tracked currently.

> 
> Tim.


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Tim Deegan <tim@xen.org>
Cc: Yang Zhang <yang.z.zhang@intel.com>, xen-devel@lists.xen.org, JBeulich@suse.com, xiantao.zhang@intel.com
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 10 Feb 2014 10:42:05 +0000
Message-ID: <52F8ACFD.7090801@citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 10/02/14 08:03, Tim Deegan wrote:
> At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote:
>> From: Yang Zhang <yang.z.zhang@Intel.com>
>>
>> When enabling log dirty mode, it sets all guest's memory to readonly.
>> And in HAP enabled domain, it modifies all EPT entries to clear write bit
>> to make sure it is readonly. This will cause problem if VT-d shares page
>> table with EPT: the device may issue a DMA write request, then VT-d engine
>> tells it the target memory is readonly and result in VT-d fault.
> So that's a problem even if only the VGA framebuffer is being tracked
> -- DMA from a passthrough device will either cause a spurious error or
> fail to update the dirt bitmap. 
>
> I think it would be better not to allow VT-d and EPT to share
> pagetables in cases where devices are passed through (i.e. all cases
> where VT-d is in use).
>
> Tim.

Sadly, this would make shared VT-d/EPT completely pointless as a
feature, causing extra memory overhead in Xen by having to maintain EPT
and IOMMU tables separately.

Any usecase which doesn't involve dirty vram tracking (e.g. headless VM
with SRIOV, PVH dom0) would be adversely affected.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <George.Dunlap@eu.citrix.com>
To: Yang Zhang <yang.z.zhang@intel.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jan Beulich <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, Andrew Cooper <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 10 Feb 2014 16:13:50 +0000
Message-ID: <CAFLBxZYrQf6w3N_uz4=6eEyKL_rU6EcDoKSD90ASSUt1jOLBkw@mail.gmail.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

create ^
title it HAP dirty vram tracking breaks pass-through
thanks

On Mon, Feb 10, 2014 at 6:14 AM, Yang Zhang <yang.z.zhang@intel.com> wrote:
> From: Yang Zhang <yang.z.zhang@Intel.com>
>
> When enabling log dirty mode, it sets all guest's memory to readonly.
> And in HAP enabled domain, it modifies all EPT entries to clear write bit
> to make sure it is readonly. This will cause problem if VT-d shares page
> table with EPT: the device may issue a DMA write request, then VT-d engine
> tells it the target memory is readonly and result in VT-d fault.
>
> Currnetly, there are two places will enable log dirty mode: migration and vram
> tracking. Migration with device assigned is not allowed, so it is ok. For vram,
> it doesn't need to set all memory to readonly. Only track the vram range is enough.
>
> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
> ---
>  xen/arch/x86/mm/hap/hap.c       |   20 ++++++++++++++------
>  xen/arch/x86/mm/paging.c        |    9 +++++----
>  xen/arch/x86/mm/shadow/common.c |    2 +-
>  xen/include/asm-x86/domain.h    |    2 +-
>  xen/include/asm-x86/paging.h    |    5 +++--
>  xen/include/asm-x86/shadow.h    |    2 +-
>  6 files changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
> index d3f64bd..5f75636 100644
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -82,7 +82,7 @@ int hap_track_dirty_vram(struct domain *d,
>          if ( !paging_mode_log_dirty(d) )
>          {
>              hap_logdirty_init(d);
> -            rc = paging_log_dirty_enable(d);
> +            rc = paging_log_dirty_enable(d, 0);
>              if ( rc )
>                  goto out;
>          }
> @@ -167,17 +167,25 @@ out:
>  /*            HAP LOG DIRTY SUPPORT             */
>  /************************************************/
>
> -/* hap code to call when log_dirty is enable. return 0 if no problem found. */
> -static int hap_enable_log_dirty(struct domain *d)
> +/*
> + * hap code to call when log_dirty is enable. return 0 if no problem found.
> + *
> + * NB: Domain that having device assigned should not set log_global. Because
> + * there is no way to track the memory updating from device.
> + */
> +static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
>  {
>      /* turn on PG_log_dirty bit in paging mode */
>      paging_lock(d);
>      d->arch.paging.mode |= PG_log_dirty;
>      paging_unlock(d);
>
> -    /* set l1e entries of P2M table to be read-only. */
> -    p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> -    flush_tlb_mask(d->domain_dirty_cpumask);
> +    if ( log_global )
> +    {
> +        /* set l1e entries of P2M table to be read-only. */
> +        p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> +        flush_tlb_mask(d->domain_dirty_cpumask);
> +    }
>      return 0;
>  }
>
> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
> index 21344e5..ab5eacb 100644
> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -164,7 +164,7 @@ void paging_free_log_dirty_bitmap(struct domain *d)
>      paging_unlock(d);
>  }
>
> -int paging_log_dirty_enable(struct domain *d)
> +int paging_log_dirty_enable(struct domain *d, bool_t log_global)
>  {
>      int ret;
>
> @@ -172,7 +172,7 @@ int paging_log_dirty_enable(struct domain *d)
>          return -EINVAL;
>
>      domain_pause(d);
> -    ret = d->arch.paging.log_dirty.enable_log_dirty(d);
> +    ret = d->arch.paging.log_dirty.enable_log_dirty(d, log_global);
>      domain_unpause(d);
>
>      return ret;
> @@ -489,7 +489,8 @@ void paging_log_dirty_range(struct domain *d,
>   * These function pointers must not be followed with the log-dirty lock held.
>   */
>  void paging_log_dirty_init(struct domain *d,
> -                           int    (*enable_log_dirty)(struct domain *d),
> +                           int    (*enable_log_dirty)(struct domain *d,
> +                                                      bool_t log_global),
>                             int    (*disable_log_dirty)(struct domain *d),
>                             void   (*clean_dirty_bitmap)(struct domain *d))
>  {
> @@ -590,7 +591,7 @@ int paging_domctl(struct domain *d, xen_domctl_shadow_op_t *sc,
>      case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY:
>          if ( hap_enabled(d) )
>              hap_logdirty_init(d);
> -        return paging_log_dirty_enable(d);
> +        return paging_log_dirty_enable(d, 1);
>
>      case XEN_DOMCTL_SHADOW_OP_OFF:
>          if ( paging_mode_log_dirty(d) )
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index 0bfa595..11c6b62 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -3418,7 +3418,7 @@ shadow_write_p2m_entry(struct vcpu *v, unsigned long gfn,
>  /* Shadow specific code which is called in paging_log_dirty_enable().
>   * Return 0 if no problem found.
>   */
> -int shadow_enable_log_dirty(struct domain *d)
> +int shadow_enable_log_dirty(struct domain *d, bool_t log_global)
>  {
>      int ret;
>
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index ea72db2..4ff89f0 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -169,7 +169,7 @@ struct log_dirty_domain {
>      unsigned int   dirty_count;
>
>      /* functions which are paging mode specific */
> -    int            (*enable_log_dirty   )(struct domain *d);
> +    int            (*enable_log_dirty   )(struct domain *d, bool_t log_global);
>      int            (*disable_log_dirty  )(struct domain *d);
>      void           (*clean_dirty_bitmap )(struct domain *d);
>  };
> diff --git a/xen/include/asm-x86/paging.h b/xen/include/asm-x86/paging.h
> index cd7ee3b..8dd2a61 100644
> --- a/xen/include/asm-x86/paging.h
> +++ b/xen/include/asm-x86/paging.h
> @@ -143,14 +143,15 @@ void paging_log_dirty_range(struct domain *d,
>                              uint8_t *dirty_bitmap);
>
>  /* enable log dirty */
> -int paging_log_dirty_enable(struct domain *d);
> +int paging_log_dirty_enable(struct domain *d, bool_t log_global);
>
>  /* disable log dirty */
>  int paging_log_dirty_disable(struct domain *d);
>
>  /* log dirty initialization */
>  void paging_log_dirty_init(struct domain *d,
> -                           int  (*enable_log_dirty)(struct domain *d),
> +                           int  (*enable_log_dirty)(struct domain *d,
> +                                                    bool_t log_global),
>                             int  (*disable_log_dirty)(struct domain *d),
>                             void (*clean_dirty_bitmap)(struct domain *d));
>
> diff --git a/xen/include/asm-x86/shadow.h b/xen/include/asm-x86/shadow.h
> index 852023d..348915e 100644
> --- a/xen/include/asm-x86/shadow.h
> +++ b/xen/include/asm-x86/shadow.h
> @@ -82,7 +82,7 @@ void shadow_teardown(struct domain *d);
>  void shadow_final_teardown(struct domain *d);
>
>  /* shadow code to call when log dirty is enabled */
> -int shadow_enable_log_dirty(struct domain *d);
> +int shadow_enable_log_dirty(struct domain *d, bool_t log_global);
>
>  /* shadow code to call when log dirty is disabled */
>  int shadow_disable_log_dirty(struct domain *d);
> --
> 1.7.1
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


From: Tim Deegan <tim@xen.org>
To: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "JBeulich@suse.com" <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 11 Feb 2014 10:02:02 +0100
Message-ID: <20140211090202.GC92054@deinos.phlegethon.org>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

At 08:15 +0000 on 10 Feb (1392016516), Zhang, Yang Z wrote:
> Tim Deegan wrote on 2014-02-10:
> > At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote:
> >> From: Yang Zhang <yang.z.zhang@Intel.com>
> >> 
> >> When enabling log dirty mode, it sets all guest's memory to readonly.
> >> And in HAP enabled domain, it modifies all EPT entries to clear
> >> write bit to make sure it is readonly. This will cause problem if
> >> VT-d shares page table with EPT: the device may issue a DMA write
> >> request, then VT-d engine tells it the target memory is readonly and
> >> result in VT-d
> > fault.
> > 
> > So that's a problem even if only the VGA framebuffer is being tracked
> > -- DMA from a passthrough device will either cause a spurious error or
> > fail to update the dirt bitmap.
> 
> Do you mean the VGA frambuffer will be used as DMA buffer in guest? If yes, I think it's guest's responsibility to ensure it never happens.
> 

I don't think that works.  We can't expect arbitrary OSes to (a) know
they're running on Xen and (b) know that that means they can't DMA to
or from their framebuffers.

> Without VT-d and EPT share page, we still cannot track the memory
> updating from DMA.

Yeah, but at least we don't risk crashing the _host_ by throwing DMA
failures around. 

> I think the point is that we cannot track the
> memory updating via DMA. So the user should use the log dirty mode
> carefully. Also, I am not sure whether the memory updating from dom0
> and QEMU is tracked currently.

Yes, dom0 and qemu updates are tracked in the log-dirty bitmaps.

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <George.Dunlap@eu.citrix.com>
To: Tim Deegan <tim@xen.org>
Cc: "JBeulich@suse.com" <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Zhang, Yang Z" <yang.z.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 11 Feb 2014 10:59:38 +0000
Message-ID: <CAFLBxZaS_Sa03VxmPqDBB0wWEdjwqEa__RbtYvwuMd=QDk9ghQ@mail.gmail.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On Tue, Feb 11, 2014 at 9:02 AM, Tim Deegan <tim@xen.org> wrote:
> At 08:15 +0000 on 10 Feb (1392016516), Zhang, Yang Z wrote:
>> Tim Deegan wrote on 2014-02-10:
>> > At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote:
>> >> From: Yang Zhang <yang.z.zhang@Intel.com>
>> >>
>> >> When enabling log dirty mode, it sets all guest's memory to readonly.
>> >> And in HAP enabled domain, it modifies all EPT entries to clear
>> >> write bit to make sure it is readonly. This will cause problem if
>> >> VT-d shares page table with EPT: the device may issue a DMA write
>> >> request, then VT-d engine tells it the target memory is readonly and
>> >> result in VT-d
>> > fault.
>> >
>> > So that's a problem even if only the VGA framebuffer is being tracked
>> > -- DMA from a passthrough device will either cause a spurious error or
>> > fail to update the dirt bitmap.
>>
>> Do you mean the VGA frambuffer will be used as DMA buffer in guest? If yes, I think it's guest's responsibility to ensure it never happens.
>>
>
> I don't think that works.  We can't expect arbitrary OSes to (a) know
> they're running on Xen and (b) know that that means they can't DMA to
> or from their framebuffers.
>
>> Without VT-d and EPT share page, we still cannot track the memory
>> updating from DMA.
>
> Yeah, but at least we don't risk crashing the _host_ by throwing DMA
> failures around.

What I'm missing here is what you think a proper solution is.  It seems we have:

A. Share EPT/IOMMU tables, only do log-dirty tracking on the buffer
being tracked, and hope the guest doesn't DMA into video ram; DMA
causes IOMMU fault. (This really shouldn't crash the host under normal
circumstances; if it does it's a hardware bug.)
B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA into
video ram.  DMA causes missed update to dirty bitmap, which will
hopefully just cause screen corruption.
C. Do buffer scanning rather than dirty vram tracking (SLOW)
D. Don't allow both a virtual video card and pass-through

Given that most operating systems will probably *not* DMA into video
ram, and that an IOMMU fault isn't *supposed* to be able to crash the
host, 'A' seems like the most reasonable option to me.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: Tim Deegan <tim@xen.org>
To: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Zhang, Yang Z" <yang.z.zhang@intel.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "JBeulich@suse.com" <JBeulich@suse.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 11 Feb 2014 12:55:53 +0100
Message-ID: <20140211115553.GB97288@deinos.phlegethon.org>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote:
> On Tue, Feb 11, 2014 at 9:02 AM, Tim Deegan <tim@xen.org> wrote:
> > At 08:15 +0000 on 10 Feb (1392016516), Zhang, Yang Z wrote:
> >> Tim Deegan wrote on 2014-02-10:
> >> > At 14:14 +0800 on 10 Feb (1392038040), Yang Zhang wrote:
> >> >> From: Yang Zhang <yang.z.zhang@Intel.com>
> >> >>
> >> >> When enabling log dirty mode, it sets all guest's memory to readonly.
> >> >> And in HAP enabled domain, it modifies all EPT entries to clear
> >> >> write bit to make sure it is readonly. This will cause problem if
> >> >> VT-d shares page table with EPT: the device may issue a DMA write
> >> >> request, then VT-d engine tells it the target memory is readonly and
> >> >> result in VT-d
> >> > fault.
> >> >
> >> > So that's a problem even if only the VGA framebuffer is being tracked
> >> > -- DMA from a passthrough device will either cause a spurious error or
> >> > fail to update the dirt bitmap.
> >>
> >> Do you mean the VGA frambuffer will be used as DMA buffer in guest? If yes, I think it's guest's responsibility to ensure it never happens.
> >>
> >
> > I don't think that works.  We can't expect arbitrary OSes to (a) know
> > they're running on Xen and (b) know that that means they can't DMA to
> > or from their framebuffers.
> >
> >> Without VT-d and EPT share page, we still cannot track the memory
> >> updating from DMA.
> >
> > Yeah, but at least we don't risk crashing the _host_ by throwing DMA
> > failures around.
> 
> What I'm missing here is what you think a proper solution is.

A _proper_ solution would be for the IOMMU h/w to allow restartable
faults, so that we can do all the usual fault-driven virtual memory
operations with DMA. :)  In the meantime...

>  It seems we have:
> 
> A. Share EPT/IOMMU tables, only do log-dirty tracking on the buffer
> being tracked, and hope the guest doesn't DMA into video ram; DMA
> causes IOMMU fault. (This really shouldn't crash the host under normal
> circumstances; if it does it's a hardware bug.)

Note "hope" and "shouldn't" there. :)

> B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA into
> video ram.  DMA causes missed update to dirty bitmap, which will
> hopefully just cause screen corruption.

Yep.  At a cost of about 0.2% in space and some extra bookkeeping
(for VMs that actually have devices passed through to them).
The extra bookkeeping could be expensive in some cases, but basically
all of those cases are already incompatible with IOMMU.

> C. Do buffer scanning rather than dirty vram tracking (SLOW)
> D. Don't allow both a virtual video card and pass-through

E. Share EPT and IOMMU tables until someone turns on log-dirty mode
and then split them out.  That one 

> Given that most operating systems will probably *not* DMA into video
> ram, and that an IOMMU fault isn't *supposed* to be able to crash the
> host, 'A' seems like the most reasonable option to me.

Meh, OK.  I prefer 'B' but 'A' is better than nothing, I guess, and
seems to have most support from other people.  On that basis this
patch can have my Ack.

Is it intended for 4.4 as a bugfix (i.e. should I be checking it in?)

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Tim Deegan" <tim@xen.org>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Yang Z Zhang <yang.z.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 11 Feb 2014 12:57:55 +0000
Message-ID: <52FA2C63020000780011B201@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 11.02.14 at 12:55, Tim Deegan <tim@xen.org> wrote:
> At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote:
>> What I'm missing here is what you think a proper solution is.
> 
> A _proper_ solution would be for the IOMMU h/w to allow restartable
> faults, so that we can do all the usual fault-driven virtual memory
> operations with DMA. :)  In the meantime...

Or maintaining the A/D bits for IOMMU side accesses too.

>>  It seems we have:
>> 
>> A. Share EPT/IOMMU tables, only do log-dirty tracking on the buffer
>> being tracked, and hope the guest doesn't DMA into video ram; DMA
>> causes IOMMU fault. (This really shouldn't crash the host under normal
>> circumstances; if it does it's a hardware bug.)
> 
> Note "hope" and "shouldn't" there. :)
> 
>> B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA into
>> video ram.  DMA causes missed update to dirty bitmap, which will
>> hopefully just cause screen corruption.
> 
> Yep.  At a cost of about 0.2% in space and some extra bookkeeping
> (for VMs that actually have devices passed through to them).
> The extra bookkeeping could be expensive in some cases, but basically
> all of those cases are already incompatible with IOMMU.
> 
>> C. Do buffer scanning rather than dirty vram tracking (SLOW)
>> D. Don't allow both a virtual video card and pass-through
> 
> E. Share EPT and IOMMU tables until someone turns on log-dirty mode
> and then split them out.  That one 

Wouldn't that be problematic in terms of memory being available,
namely when using ballooning in Dom0?

>> Given that most operating systems will probably *not* DMA into video
>> ram, and that an IOMMU fault isn't *supposed* to be able to crash the
>> host, 'A' seems like the most reasonable option to me.
> 
> Meh, OK.  I prefer 'B' but 'A' is better than nothing, I guess, and
> seems to have most support from other people.  On that basis this
> patch can have my Ack.

I too would consider B better than A.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Tim Deegan <tim@xen.org>, Jan Beulich <JBeulich@suse.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 11 Feb 2014 15:55:57 +0000
Message-ID: <52FA480D.9040707@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 02/11/2014 12:57 PM, Jan Beulich wrote:
>>>> On 11.02.14 at 12:55, Tim Deegan <tim@xen.org> wrote:
>> At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote:
>>> What I'm missing here is what you think a proper solution is.
>>
>> A _proper_ solution would be for the IOMMU h/w to allow restartable
>> faults, so that we can do all the usual fault-driven virtual memory
>> operations with DMA. :)  In the meantime...
>
> Or maintaining the A/D bits for IOMMU side accesses too.
>
>>>   It seems we have:
>>>
>>> A. Share EPT/IOMMU tables, only do log-dirty tracking on the buffer
>>> being tracked, and hope the guest doesn't DMA into video ram; DMA
>>> causes IOMMU fault. (This really shouldn't crash the host under normal
>>> circumstances; if it does it's a hardware bug.)
>>
>> Note "hope" and "shouldn't" there. :)
>>
>>> B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA into
>>> video ram.  DMA causes missed update to dirty bitmap, which will
>>> hopefully just cause screen corruption.
>>
>> Yep.  At a cost of about 0.2% in space and some extra bookkeeping
>> (for VMs that actually have devices passed through to them).
>> The extra bookkeeping could be expensive in some cases, but basically
>> all of those cases are already incompatible with IOMMU.
>>
>>> C. Do buffer scanning rather than dirty vram tracking (SLOW)
>>> D. Don't allow both a virtual video card and pass-through
>>
>> E. Share EPT and IOMMU tables until someone turns on log-dirty mode
>> and then split them out.  That one
>
> Wouldn't that be problematic in terms of memory being available,
> namely when using ballooning in Dom0?
>
>>> Given that most operating systems will probably *not* DMA into video
>>> ram, and that an IOMMU fault isn't *supposed* to be able to crash the
>>> host, 'A' seems like the most reasonable option to me.
>>
>> Meh, OK.  I prefer 'B' but 'A' is better than nothing, I guess, and
>> seems to have most support from other people.  On that basis this
>> patch can have my Ack.
>
> I too would consider B better than A.

I think I got a bit distracted with the "A isn't really so bad" thing. 
Actually, if the overhead of not sharing tables isn't very high, then B 
isn't such a bad option.  In fact, B is what I expected Yang to submit 
when he originally described the problem.

I was going to say, from a release perspective, B is probably the safest 
option for now.  But on the other hand, if we've been testing sharing 
all this time, maybe switching back over to non-sharing whole-hog has 
the higher risk?

Anyway, both are at least probably equal risk-wise.  How easy is it to 
implement?

  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Tim Deegan <tim@xen.org>, George Dunlap <george.dunlap@eu.citrix.com>, Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 12 Feb 2014 00:53:34 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9D98D6@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

George Dunlap wrote on 2014-02-11:
> On 02/11/2014 12:57 PM, Jan Beulich wrote:
>>>>> On 11.02.14 at 12:55, Tim Deegan <tim@xen.org> wrote:
>>> At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote:
>>>> What I'm missing here is what you think a proper solution is.
>>> 
>>> A _proper_ solution would be for the IOMMU h/w to allow restartable
>>> faults, so that we can do all the usual fault-driven virtual memory
>>> operations with DMA. :)  In the meantime...
>> 
>> Or maintaining the A/D bits for IOMMU side accesses too.
>> 
>>>>   It seems we have:
>>>> A. Share EPT/IOMMU tables, only do log-dirty tracking on the
>>>> buffer being tracked, and hope the guest doesn't DMA into video
>>>> ram; DMA causes IOMMU fault. (This really shouldn't crash the host
>>>> under normal circumstances; if it does it's a hardware bug.)
>>> 
>>> Note "hope" and "shouldn't" there. :)
>>> 
>>>> B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA
>>>> into video ram.  DMA causes missed update to dirty bitmap, which
>>>> will hopefully just cause screen corruption.
>>> 
>>> Yep.  At a cost of about 0.2% in space and some extra bookkeeping
>>> (for VMs that actually have devices passed through to them).
>>> The extra bookkeeping could be expensive in some cases, but
>>> basically all of those cases are already incompatible with IOMMU.
>>> 
>>>> C. Do buffer scanning rather than dirty vram tracking (SLOW) D.
>>>> Don't allow both a virtual video card and pass-through
>>> 
>>> E. Share EPT and IOMMU tables until someone turns on log-dirty mode
>>> and then split them out.  That one
>> 
>> Wouldn't that be problematic in terms of memory being available,
>> namely when using ballooning in Dom0?
>> 
>>>> Given that most operating systems will probably *not* DMA into
>>>> video ram, and that an IOMMU fault isn't *supposed* to be able to
>>>> crash the host, 'A' seems like the most reasonable option to me.
>>> 
>>> Meh, OK.  I prefer 'B' but 'A' is better than nothing, I guess, and
>>> seems to have most support from other people.  On that basis this
>>> patch can have my Ack.
>> 
>> I too would consider B better than A.
> 
> I think I got a bit distracted with the "A isn't really so bad" thing.
> Actually, if the overhead of not sharing tables isn't very high, then
> B isn't such a bad option.  In fact, B is what I expected Yang to
> submit when he originally described the problem.

Actually, the first solution came to my mind is B. Then I realized that even chose B, we still cannot track the memory updating from DMA(even with A/D bit, it still a problem). Also, considering the current usage case of log dirty in Xen(only vram tracking has problem), I though A is better.: Hypervisor only need to track the vram change. If a malicious guest try to DMA to vram range, it only crashed himself (This should be reasonable).

> 
> I was going to say, from a release perspective, B is probably the
> safest option for now.  But on the other hand, if we've been testing
> sharing all this time, maybe switching back over to non-sharing whole-hog has the higher risk?

Another problem with B is that current VT-d large paging supporting relies on the sharing EPT and VT-d page table. This means if we choose B, then we need to re-enable VT-d large page. This would be a huge performance impaction for Xen 4.4 on using VT-d solution.

> 
> Anyway, both are at least probably equal risk-wise.  How easy is it to
> implement?
> 
>   -George


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Tim Deegan <tim@xen.org>, "Zhang, Yang Z" <yang.z.zhang@intel.com>, Jan Beulich <JBeulich@suse.com>
Cc: "Zhang, Xiantao" <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Thu, 13 Feb 2014 15:46:06 +0000
Message-ID: <52FCE8BE.8050105@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
> George Dunlap wrote on 2014-02-11:
>> On 02/11/2014 12:57 PM, Jan Beulich wrote:
>>>>>> On 11.02.14 at 12:55, Tim Deegan <tim@xen.org> wrote:
>>>> At 10:59 +0000 on 11 Feb (1392112778), George Dunlap wrote:
>>>>> What I'm missing here is what you think a proper solution is.
>>>> A _proper_ solution would be for the IOMMU h/w to allow restartable
>>>> faults, so that we can do all the usual fault-driven virtual memory
>>>> operations with DMA. :)  In the meantime...
>>> Or maintaining the A/D bits for IOMMU side accesses too.
>>>
>>>>>    It seems we have:
>>>>> A. Share EPT/IOMMU tables, only do log-dirty tracking on the
>>>>> buffer being tracked, and hope the guest doesn't DMA into video
>>>>> ram; DMA causes IOMMU fault. (This really shouldn't crash the host
>>>>> under normal circumstances; if it does it's a hardware bug.)
>>>> Note "hope" and "shouldn't" there. :)
>>>>
>>>>> B. Never share EPT/IOMMU tables, and hope the guest doesn't DMA
>>>>> into video ram.  DMA causes missed update to dirty bitmap, which
>>>>> will hopefully just cause screen corruption.
>>>> Yep.  At a cost of about 0.2% in space and some extra bookkeeping
>>>> (for VMs that actually have devices passed through to them).
>>>> The extra bookkeeping could be expensive in some cases, but
>>>> basically all of those cases are already incompatible with IOMMU.
>>>>
>>>>> C. Do buffer scanning rather than dirty vram tracking (SLOW) D.
>>>>> Don't allow both a virtual video card and pass-through
>>>> E. Share EPT and IOMMU tables until someone turns on log-dirty mode
>>>> and then split them out.  That one
>>> Wouldn't that be problematic in terms of memory being available,
>>> namely when using ballooning in Dom0?
>>>
>>>>> Given that most operating systems will probably *not* DMA into
>>>>> video ram, and that an IOMMU fault isn't *supposed* to be able to
>>>>> crash the host, 'A' seems like the most reasonable option to me.
>>>> Meh, OK.  I prefer 'B' but 'A' is better than nothing, I guess, and
>>>> seems to have most support from other people.  On that basis this
>>>> patch can have my Ack.
>>> I too would consider B better than A.
>> I think I got a bit distracted with the "A isn't really so bad" thing.
>> Actually, if the overhead of not sharing tables isn't very high, then
>> B isn't such a bad option.  In fact, B is what I expected Yang to
>> submit when he originally described the problem.
> Actually, the first solution came to my mind is B. Then I realized that even chose B, we still cannot track the memory updating from DMA(even with A/D bit, it still a problem). Also, considering the current usage case of log dirty in Xen(only vram tracking has problem), I though A is better.: Hypervisor only need to track the vram change. If a malicious guest try to DMA to vram range, it only crashed himself (This should be reasonable).
>
>> I was going to say, from a release perspective, B is probably the
>> safest option for now.  But on the other hand, if we've been testing
>> sharing all this time, maybe switching back over to non-sharing whole-hog has the higher risk?
> Another problem with B is that current VT-d large paging supporting relies on the sharing EPT and VT-d page table. This means if we choose B, then we need to re-enable VT-d large page. This would be a huge performance impaction for Xen 4.4 on using VT-d solution.

OK -- if that's the case, then it definitely tips the balance back to 
A.  Unless Tim or Jan disagrees, can one of you two check it in?

Don't rush your judgement; but it would be nice to have this in before 
RC4, which would mean checking it in today preferrably, or early 
tomorrow at the latest.

  -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>, "Tim Deegan" <tim@xen.org>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Thu, 13 Feb 2014 15:55:43 +0000
Message-ID: <52FCF90F020000780011C29A@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
>> George Dunlap wrote on 2014-02-11:
>>> I think I got a bit distracted with the "A isn't really so bad" thing.
>>> Actually, if the overhead of not sharing tables isn't very high, then
>>> B isn't such a bad option.  In fact, B is what I expected Yang to
>>> submit when he originally described the problem.
>> Actually, the first solution came to my mind is B. Then I realized that even 
> chose B, we still cannot track the memory updating from DMA(even with A/D 
> bit, it still a problem). Also, considering the current usage case of log 
> dirty in Xen(only vram tracking has problem), I though A is better.: 
> Hypervisor only need to track the vram change. If a malicious guest try to 
> DMA to vram range, it only crashed himself (This should be reasonable).
>>
>>> I was going to say, from a release perspective, B is probably the
>>> safest option for now.  But on the other hand, if we've been testing
>>> sharing all this time, maybe switching back over to non-sharing whole-hog has 
> the higher risk?
>> Another problem with B is that current VT-d large paging supporting relies on 
> the sharing EPT and VT-d page table. This means if we choose B, then we need 
> to re-enable VT-d large page. This would be a huge performance impaction for 
> Xen 4.4 on using VT-d solution.
> 
> OK -- if that's the case, then it definitely tips the balance back to 
> A.  Unless Tim or Jan disagrees, can one of you two check it in?
> 
> Don't rush your judgement; but it would be nice to have this in before 
> RC4, which would mean checking it in today preferrably, or early 
> tomorrow at the latest.

That would be Tim then, as he would have to approve of it anyway.
I should also say that while I certainly understand the argumentation
above, I would still want to go this route only with the promise that
B is going to be worked on reasonably soon after the release, ideally
with the goal of backporting the changes for 4.4.1.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: Tim Deegan <tim@xen.org>
To: Jan Beulich <JBeulich@suse.com>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, George Dunlap <george.dunlap@eu.citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Thu, 13 Feb 2014 17:20:22 +0100
Message-ID: <20140213162022.GE82703@deinos.phlegethon.org>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote:
> >>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> > On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
> >> George Dunlap wrote on 2014-02-11:
> >>> I think I got a bit distracted with the "A isn't really so bad" thing.
> >>> Actually, if the overhead of not sharing tables isn't very high, then
> >>> B isn't such a bad option.  In fact, B is what I expected Yang to
> >>> submit when he originally described the problem.
> >> Actually, the first solution came to my mind is B. Then I realized that even 
> > chose B, we still cannot track the memory updating from DMA(even with A/D 
> > bit, it still a problem). Also, considering the current usage case of log 
> > dirty in Xen(only vram tracking has problem), I though A is better.: 
> > Hypervisor only need to track the vram change. If a malicious guest try to 
> > DMA to vram range, it only crashed himself (This should be reasonable).
> >>
> >>> I was going to say, from a release perspective, B is probably the
> >>> safest option for now.  But on the other hand, if we've been testing
> >>> sharing all this time, maybe switching back over to non-sharing whole-hog has 
> > the higher risk?
> >> Another problem with B is that current VT-d large paging supporting relies on 
> > the sharing EPT and VT-d page table. This means if we choose B, then we need 
> > to re-enable VT-d large page. This would be a huge performance impaction for 
> > Xen 4.4 on using VT-d solution.
> > 
> > OK -- if that's the case, then it definitely tips the balance back to 
> > A.  Unless Tim or Jan disagrees, can one of you two check it in?
> > 
> > Don't rush your judgement; but it would be nice to have this in before 
> > RC4, which would mean checking it in today preferrably, or early 
> > tomorrow at the latest.
> 
> That would be Tim then, as he would have to approve of it anyway.

Done.

> I should also say that while I certainly understand the argumentation
> above, I would still want to go this route only with the promise that
> B is going to be worked on reasonably soon after the release, ideally
> with the goal of backporting the changes for 4.4.1.

Agreed.

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Jan Beulich <JBeulich@suse.com>, Tim Deegan <tim@xen.org>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Yang Z Zhang <yang.z.zhang@intel.com>, xenbugs <xen@bugs.xenproject.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Thu, 13 Feb 2014 16:25:52 +0000
Message-ID: <52FCF210.7090702@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

title 38 Implement VT-d large pages so we can avoid sharing between EPT 
and IOMMU
owner it Yang Z Zhang <yang.z.zhang@intel.com>
thanks

On 02/13/2014 04:20 PM, Tim Deegan wrote:
> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote:
>>>>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
>>>> George Dunlap wrote on 2014-02-11:
>>>>> I think I got a bit distracted with the "A isn't really so bad" thing.
>>>>> Actually, if the overhead of not sharing tables isn't very high, then
>>>>> B isn't such a bad option.  In fact, B is what I expected Yang to
>>>>> submit when he originally described the problem.
>>>> Actually, the first solution came to my mind is B. Then I realized that even
>>> chose B, we still cannot track the memory updating from DMA(even with A/D
>>> bit, it still a problem). Also, considering the current usage case of log
>>> dirty in Xen(only vram tracking has problem), I though A is better.:
>>> Hypervisor only need to track the vram change. If a malicious guest try to
>>> DMA to vram range, it only crashed himself (This should be reasonable).
>>>>> I was going to say, from a release perspective, B is probably the
>>>>> safest option for now.  But on the other hand, if we've been testing
>>>>> sharing all this time, maybe switching back over to non-sharing whole-hog has
>>> the higher risk?
>>>> Another problem with B is that current VT-d large paging supporting relies on
>>> the sharing EPT and VT-d page table. This means if we choose B, then we need
>>> to re-enable VT-d large page. This would be a huge performance impaction for
>>> Xen 4.4 on using VT-d solution.
>>>
>>> OK -- if that's the case, then it definitely tips the balance back to
>>> A.  Unless Tim or Jan disagrees, can one of you two check it in?
>>>
>>> Don't rush your judgement; but it would be nice to have this in before
>>> RC4, which would mean checking it in today preferrably, or early
>>> tomorrow at the latest.
>> That would be Tim then, as he would have to approve of it anyway.
> Done.
>
>> I should also say that while I certainly understand the argumentation
>> above, I would still want to go this route only with the promise that
>> B is going to be worked on reasonably soon after the release, ideally
>> with the goal of backporting the changes for 4.4.1.
> Agreed.

OK -- I've retitled the bug and am going to leave it open.

  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


From: "Jan Beulich" <JBeulich@suse.com>
To: "George Dunlap" <george.dunlap@eu.citrix.com>, "Yang Z Zhang" <yang.z.zhang@intel.com>, "Tim Deegan" <tim@xen.org>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 17 Feb 2014 10:18:24 +0000
Message-ID: <5301F000020000780011CCE0@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 13.02.14 at 17:20, Tim Deegan <tim@xen.org> wrote:
> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote:
>> >>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>> > On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
>> >> George Dunlap wrote on 2014-02-11:
>> >>> I think I got a bit distracted with the "A isn't really so bad" thing.
>> >>> Actually, if the overhead of not sharing tables isn't very high, then
>> >>> B isn't such a bad option.  In fact, B is what I expected Yang to
>> >>> submit when he originally described the problem.
>> >> Actually, the first solution came to my mind is B. Then I realized that 
> even 
>> > chose B, we still cannot track the memory updating from DMA(even with A/D 
>> > bit, it still a problem). Also, considering the current usage case of log 
>> > dirty in Xen(only vram tracking has problem), I though A is better.: 
>> > Hypervisor only need to track the vram change. If a malicious guest try to 
>> > DMA to vram range, it only crashed himself (This should be reasonable).
>> >>
>> >>> I was going to say, from a release perspective, B is probably the
>> >>> safest option for now.  But on the other hand, if we've been testing
>> >>> sharing all this time, maybe switching back over to non-sharing whole-hog has 
> 
>> > the higher risk?
>> >> Another problem with B is that current VT-d large paging supporting relies 
> on 
>> > the sharing EPT and VT-d page table. This means if we choose B, then we need 
> 
>> > to re-enable VT-d large page. This would be a huge performance impaction for 
>> > Xen 4.4 on using VT-d solution.
>> > 
>> > OK -- if that's the case, then it definitely tips the balance back to 
>> > A.  Unless Tim or Jan disagrees, can one of you two check it in?
>> > 
>> > Don't rush your judgement; but it would be nice to have this in before 
>> > RC4, which would mean checking it in today preferrably, or early 
>> > tomorrow at the latest.
>> 
>> That would be Tim then, as he would have to approve of it anyway.
> 
> Done.

Actually I'm afraid there are two problems with this patch:

For one, is enabling "global" log dirty mode still going to work
after VRAM-only mode already got enabled? I ask because the
paging_mode_log_dirty() check which paging_log_dirty_enable()
does first thing suggests otherwise to me (i.e. the now
conditional setting of all p2m entries to p2m_ram_logdirty would
seem to never get executed). IOW I would think that we're now
lacking a control operation allowing the transition from dirty VRAM
tracking mode to full log dirty mode.

And second, I have been fighting with finding both conditions
and (eventually) the root cause of a severe performance
regression (compared to 4.3.x) I'm observing on an EPT+IOMMU
system. This became _much_ worse after adding in the patch here
(while in fact I had hoped it might help with the originally observed
degradation): X startup fails due to timing out, and booting the
guest now takes about 20 minutes). I didn't find the root cause of
this yet, but meanwhile I know that
- the same isn't observable on SVM
- there's no problem when forcing the domain to use shadow
  mode
- there's no need for any device to actually be assigned to the
  guest
- the regression is very likely purely graphics related (based on
  the observation that when running something that regularly but
  not heavily updates the screen with X up, the guest consumes a
  full CPU's worth of processing power, yet when that updating
  doesn't happen, CPU consumption goes down, and it goes further
  down when shutting down X altogether - at least as log as the
  patch here doesn't get involved).
This I'm observing on a Westmere box (and I didn't notice it earlier
because that's one of those where due to a chipset erratum the
IOMMU gets turned off by default), so it's possible that this can't
be seen on more modern hardware. I'll hopefully find time today to
check this on the one newer (Sandy Bridge) box I have.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Tim Deegan <tim@xen.org>, Yang Z Zhang <yang.z.zhang@intel.com>, Jan Beulich <JBeulich@suse.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 17 Feb 2014 12:23:45 +0000
Message-ID: <5301FF51.1060509@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 02/17/2014 10:18 AM, Jan Beulich wrote:
>>>> On 13.02.14 at 17:20, Tim Deegan <tim@xen.org> wrote:
>> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote:
>>>>>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>>>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
>>>>> George Dunlap wrote on 2014-02-11:
>>>>>> I think I got a bit distracted with the "A isn't really so bad" thing.
>>>>>> Actually, if the overhead of not sharing tables isn't very high, then
>>>>>> B isn't such a bad option.  In fact, B is what I expected Yang to
>>>>>> submit when he originally described the problem.
>>>>> Actually, the first solution came to my mind is B. Then I realized that
>> even
>>>> chose B, we still cannot track the memory updating from DMA(even with A/D
>>>> bit, it still a problem). Also, considering the current usage case of log
>>>> dirty in Xen(only vram tracking has problem), I though A is better.:
>>>> Hypervisor only need to track the vram change. If a malicious guest try to
>>>> DMA to vram range, it only crashed himself (This should be reasonable).
>>>>>> I was going to say, from a release perspective, B is probably the
>>>>>> safest option for now.  But on the other hand, if we've been testing
>>>>>> sharing all this time, maybe switching back over to non-sharing whole-hog has
>>>> the higher risk?
>>>>> Another problem with B is that current VT-d large paging supporting relies
>> on
>>>> the sharing EPT and VT-d page table. This means if we choose B, then we need
>>>> to re-enable VT-d large page. This would be a huge performance impaction for
>>>> Xen 4.4 on using VT-d solution.
>>>>
>>>> OK -- if that's the case, then it definitely tips the balance back to
>>>> A.  Unless Tim or Jan disagrees, can one of you two check it in?
>>>>
>>>> Don't rush your judgement; but it would be nice to have this in before
>>>> RC4, which would mean checking it in today preferrably, or early
>>>> tomorrow at the latest.
>>> That would be Tim then, as he would have to approve of it anyway.
>> Done.
> Actually I'm afraid there are two problems with this patch:
>
> For one, is enabling "global" log dirty mode still going to work
> after VRAM-only mode already got enabled? I ask because the
> paging_mode_log_dirty() check which paging_log_dirty_enable()
> does first thing suggests otherwise to me (i.e. the now
> conditional setting of all p2m entries to p2m_ram_logdirty would
> seem to never get executed). IOW I would think that we're now
> lacking a control operation allowing the transition from dirty VRAM
> tracking mode to full log dirty mode.

Hmm, yes, doing a code inspection, that would appear to be the case.  
This probably wouldn't be caught by osstest, because (as I understand 
it) we never attach to the display, so dirty vram tracking is probably 
never enabled.

> And second, I have been fighting with finding both conditions
> and (eventually) the root cause of a severe performance
> regression (compared to 4.3.x) I'm observing on an EPT+IOMMU
> system. This became _much_ worse after adding in the patch here
> (while in fact I had hoped it might help with the originally observed
> degradation): X startup fails due to timing out, and booting the
> guest now takes about 20 minutes). I didn't find the root cause of
> this yet, but meanwhile I know that
> - the same isn't observable on SVM
> - there's no problem when forcing the domain to use shadow
>    mode
> - there's no need for any device to actually be assigned to the
>    guest
> - the regression is very likely purely graphics related (based on
>    the observation that when running something that regularly but
>    not heavily updates the screen with X up, the guest consumes a
>    full CPU's worth of processing power, yet when that updating
>    doesn't happen, CPU consumption goes down, and it goes further
>    down when shutting down X altogether - at least as log as the
>    patch here doesn't get involved).
> This I'm observing on a Westmere box (and I didn't notice it earlier
> because that's one of those where due to a chipset erratum the
> IOMMU gets turned off by default), so it's possible that this can't
> be seen on more modern hardware. I'll hopefully find time today to
> check this on the one newer (Sandy Bridge) box I have.

So you're saying that the slowdown happens if you have EPT+IOMMU, but 
*not* if you have EPT alone (IOMMU disabled), or shadow + IOMMU?

I have an issue I haven't had time to look into where windows installs 
are sometimes terribly slow on my Nehalem box; but it seems to be only 
with qemu-xen, not qemu-traditional.  I haven't tried with shadow.

  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>, "Tim Deegan" <tim@xen.org>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 17 Feb 2014 12:37:44 +0000
Message-ID: <530210A8020000780011CDD5@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 17.02.14 at 13:23, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> On 02/17/2014 10:18 AM, Jan Beulich wrote:
>> And second, I have been fighting with finding both conditions
>> and (eventually) the root cause of a severe performance
>> regression (compared to 4.3.x) I'm observing on an EPT+IOMMU
>> system. This became _much_ worse after adding in the patch here
>> (while in fact I had hoped it might help with the originally observed
>> degradation): X startup fails due to timing out, and booting the
>> guest now takes about 20 minutes). I didn't find the root cause of
>> this yet, but meanwhile I know that
>> - the same isn't observable on SVM
>> - there's no problem when forcing the domain to use shadow
>>    mode
>> - there's no need for any device to actually be assigned to the
>>    guest
>> - the regression is very likely purely graphics related (based on
>>    the observation that when running something that regularly but
>>    not heavily updates the screen with X up, the guest consumes a
>>    full CPU's worth of processing power, yet when that updating
>>    doesn't happen, CPU consumption goes down, and it goes further
>>    down when shutting down X altogether - at least as log as the
>>    patch here doesn't get involved).
>> This I'm observing on a Westmere box (and I didn't notice it earlier
>> because that's one of those where due to a chipset erratum the
>> IOMMU gets turned off by default), so it's possible that this can't
>> be seen on more modern hardware. I'll hopefully find time today to
>> check this on the one newer (Sandy Bridge) box I have.
> 
> So you're saying that the slowdown happens if you have EPT+IOMMU, but 
> *not* if you have EPT alone (IOMMU disabled), or shadow + IOMMU?
> 
> I have an issue I haven't had time to look into where windows installs 
> are sometimes terribly slow on my Nehalem box; but it seems to be only 
> with qemu-xen, not qemu-traditional.  I haven't tried with shadow.

Sorry I forgot to mention this - I too suspected the qemu version
update to be one possible reason, but I'm seeing the same behavior
with qemu-trad.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Jan Beulich <JBeulich@suse.com>, Yang Z Zhang <yang.z.zhang@intel.com>, Tim Deegan <tim@xen.org>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 17 Feb 2014 14:51:53 +0000
Message-ID: <53022209.1060005@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 02/17/2014 10:18 AM, Jan Beulich wrote:
>>>> On 13.02.14 at 17:20, Tim Deegan <tim@xen.org> wrote:
>> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote:
>>>>>> On 13.02.14 at 16:46, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>>>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
>>>>> George Dunlap wrote on 2014-02-11:
>>>>>> I think I got a bit distracted with the "A isn't really so bad" thing.
>>>>>> Actually, if the overhead of not sharing tables isn't very high, then
>>>>>> B isn't such a bad option.  In fact, B is what I expected Yang to
>>>>>> submit when he originally described the problem.
>>>>> Actually, the first solution came to my mind is B. Then I realized that
>> even
>>>> chose B, we still cannot track the memory updating from DMA(even with A/D
>>>> bit, it still a problem). Also, considering the current usage case of log
>>>> dirty in Xen(only vram tracking has problem), I though A is better.:
>>>> Hypervisor only need to track the vram change. If a malicious guest try to
>>>> DMA to vram range, it only crashed himself (This should be reasonable).
>>>>>> I was going to say, from a release perspective, B is probably the
>>>>>> safest option for now.  But on the other hand, if we've been testing
>>>>>> sharing all this time, maybe switching back over to non-sharing whole-hog has
>>>> the higher risk?
>>>>> Another problem with B is that current VT-d large paging supporting relies
>> on
>>>> the sharing EPT and VT-d page table. This means if we choose B, then we need
>>>> to re-enable VT-d large page. This would be a huge performance impaction for
>>>> Xen 4.4 on using VT-d solution.
>>>>
>>>> OK -- if that's the case, then it definitely tips the balance back to
>>>> A.  Unless Tim or Jan disagrees, can one of you two check it in?
>>>>
>>>> Don't rush your judgement; but it would be nice to have this in before
>>>> RC4, which would mean checking it in today preferrably, or early
>>>> tomorrow at the latest.
>>> That would be Tim then, as he would have to approve of it anyway.
>> Done.
> Actually I'm afraid there are two problems with this patch:
>
> For one, is enabling "global" log dirty mode still going to work
> after VRAM-only mode already got enabled? I ask because the
> paging_mode_log_dirty() check which paging_log_dirty_enable()
> does first thing suggests otherwise to me (i.e. the now
> conditional setting of all p2m entries to p2m_ram_logdirty would
> seem to never get executed). IOW I would think that we're now
> lacking a control operation allowing the transition from dirty VRAM
> tracking mode to full log dirty mode.

Hrm, will so far playing with this I've been unable to get a localhost 
migrate to fail with the vncviewer attached.  Which seems a bit strange...

  -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Xiantao Zhang" <xiantao.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>, "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Donald D Dugger <donald.d.dugger@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 17 Feb 2014 15:00:57 +0000
Message-ID: <53023239020000780011CED9@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote:
> And second, I have been fighting with finding both conditions
> and (eventually) the root cause of a severe performance
> regression (compared to 4.3.x) I'm observing on an EPT+IOMMU
> system. This became _much_ worse after adding in the patch here
> (while in fact I had hoped it might help with the originally observed
> degradation): X startup fails due to timing out, and booting the
> guest now takes about 20 minutes). I didn't find the root cause of
> this yet, but meanwhile I know that
> - the same isn't observable on SVM
> - there's no problem when forcing the domain to use shadow
>   mode
> - there's no need for any device to actually be assigned to the
>   guest
> - the regression is very likely purely graphics related (based on
>   the observation that when running something that regularly but
>   not heavily updates the screen with X up, the guest consumes a
>   full CPU's worth of processing power, yet when that updating
>   doesn't happen, CPU consumption goes down, and it goes further
>   down when shutting down X altogether - at least as log as the
>   patch here doesn't get involved).
> This I'm observing on a Westmere box (and I didn't notice it earlier
> because that's one of those where due to a chipset erratum the
> IOMMU gets turned off by default), so it's possible that this can't
> be seen on more modern hardware. I'll hopefully find time today to
> check this on the one newer (Sandy Bridge) box I have.

Just got done with trying this: By default, things work fine there.
As soon as I use "iommu=no-snoop", things go bad (even worse
than one the older box - the guest is consuming about 2.5 CPUs
worth of processing power _without_ the patch here in use, so I
don't even want to think about trying it there); I guessed that to
be another of the potential sources of the problem since on that
older box the respective hardware feature is unavailable.

While I'll try to look into this further, I guess I have to defer to our
VT-d specialists at Intel at this point...

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Tim Deegan" <tim@xen.org>, "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Xiantao Zhang <xiantao.zhang@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 17 Feb 2014 15:05:01 +0000
Message-ID: <5302332D020000780011CEF1@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 17.02.14 at 15:51, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> On 02/17/2014 10:18 AM, Jan Beulich wrote:
>> Actually I'm afraid there are two problems with this patch:
>>
>> For one, is enabling "global" log dirty mode still going to work
>> after VRAM-only mode already got enabled? I ask because the
>> paging_mode_log_dirty() check which paging_log_dirty_enable()
>> does first thing suggests otherwise to me (i.e. the now
>> conditional setting of all p2m entries to p2m_ram_logdirty would
>> seem to never get executed). IOW I would think that we're now
>> lacking a control operation allowing the transition from dirty VRAM
>> tracking mode to full log dirty mode.
> 
> Hrm, will so far playing with this I've been unable to get a localhost 
> migrate to fail with the vncviewer attached.  Which seems a bit strange...

Not necessarily - it may depend on how the tools actually do this:
They might temporarily disable log dirty mode altogether, just to
re-enable full mode again right away. But this specific usage of the
hypervisor interface wouldn't (to me) mean that other tool stacks
might not be doing this differently.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Jan Beulich <JBeulich@suse.com>, George Dunlap <george.dunlap@eu.citrix.com>, Tim Deegan <tim@xen.org>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "Dugger, Donald D" <donald.d.dugger@intel.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 18 Feb 2014 03:14:47 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9ED8B0@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

Jan Beulich wrote on 2014-02-17:
>>>> On 17.02.14 at 15:51, George Dunlap <george.dunlap@eu.citrix.com>
> wrote:
>> On 02/17/2014 10:18 AM, Jan Beulich wrote:
>>> Actually I'm afraid there are two problems with this patch:
>>> 
>>> For one, is enabling "global" log dirty mode still going to work
>>> after VRAM-only mode already got enabled? I ask because the
>>> paging_mode_log_dirty() check which paging_log_dirty_enable() does
>>> first thing suggests otherwise to me (i.e. the now conditional
>>> setting of all p2m entries to p2m_ram_logdirty would seem to never
>>> get executed). IOW I would think that we're now lacking a control
>>> operation allowing the transition from dirty VRAM tracking mode to
>>> full log dirty mode.
>> 
>> Hrm, will so far playing with this I've been unable to get a
>> localhost migrate to fail with the vncviewer attached.  Which seems a bit strange...
> 
> Not necessarily - it may depend on how the tools actually do this:
> They might temporarily disable log dirty mode altogether, just to
> re-enable full mode again right away. But this specific usage of the
> hypervisor interface wouldn't (to me) mean that other tool stacks
> might not be doing this differently.

You are right. Before migration, libxc will disable log dirty mode if it already enabled before and re-enable it. So when I am developing this patch, I think it is ok for migration.

If there really have other tool stacks also will use this interface (Is it true?), perhaps my original patch is better which will check paging_mode_log_dirty(d) && log_global:

diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index ab5eacb..368c975 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -168,7 +168,7 @@ int paging_log_dirty_enable(struct domain *d, bool_t log_global)
 {
     int ret;
 
-    if ( paging_mode_log_dirty(d) )
+    if ( paging_mode_log_dirty(d) && !log_global )
         return -EINVAL;
 
     domain_pause(d);


> 
> Jan


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: George Dunlap <george.dunlap@eu.citrix.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, Jan Beulich <JBeulich@suse.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Dugger, Donald D" <donald.d.dugger@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 18 Feb 2014 03:25:21 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9ED8D3@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

Jan Beulich wrote on 2014-02-17:
>>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote:
>> And second, I have been fighting with finding both conditions and
>> (eventually) the root cause of a severe performance regression
>> (compared to 4.3.x) I'm observing on an EPT+IOMMU system. This
>> became _much_ worse after adding in the patch here (while in fact I
>> had hoped it might help with the originally observed
>> degradation): X startup fails due to timing out, and booting the
>> guest now takes about 20 minutes). I didn't find the root cause of
>> this yet, but meanwhile I know that
>> - the same isn't observable on SVM
>> - there's no problem when forcing the domain to use shadow
>>   mode - there's no need for any device to actually be assigned to the
>>   guest - the regression is very likely purely graphics related (based
>>   on the observation that when running something that regularly but not
>>   heavily updates the screen with X up, the guest consumes a full CPU's
>>   worth of processing power, yet when that updating doesn't happen, CPU
>>   consumption goes down, and it goes further down when shutting down X
>>   altogether - at least as log as the patch here doesn't get involved).
>> This I'm observing on a Westmere box (and I didn't notice it earlier
>> because that's one of those where due to a chipset erratum the IOMMU
>> gets turned off by default), so it's possible that this can't be
>> seen on more modern hardware. I'll hopefully find time today to
>> check this on the one newer (Sandy Bridge) box I have.
> 
> Just got done with trying this: By default, things work fine there.
> As soon as I use "iommu=no-snoop", things go bad (even worse than one
> the older box - the guest is consuming about 2.5 CPUs worth of
> processing power _without_ the patch here in use, so I don't even want
> to think about trying it there); I guessed that to be another of the
> potential sources of the problem since on that older box the respective hardware feature is unavailable.
> 
> While I'll try to look into this further, I guess I have to defer to
> our VT-d specialists at Intel at this point...
> 

Hi, Jan,

I tried to reproduce it. But unfortunately, I cannot reproduce it in my box (sandy bridge EP)with latest Xen(include my patch). I guess my configuration or steps may wrong, here is mine:

1. add iommu=1,no-snoop in by xen cmd line:
(XEN) Intel VT-d Snoop Control not enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Shared EPT tables enabled.

2. boot a rhel6u4 guest.

3. after guest boot up, run startx inside guest.

4. a few second, the X windows shows and didn't see any error. Also the CPU utilization is about 1.7%.

Any thing wrong?

> Jan


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Xiantao Zhang" <xiantao.zhang@intel.com>, "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Donald D Dugger <donald.d.dugger@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 18 Feb 2014 08:30:56 +0000
Message-ID: <53032850020000780011D1FD@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 17.02.14 at 16:00, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote:
>> This I'm observing on a Westmere box (and I didn't notice it earlier
>> because that's one of those where due to a chipset erratum the
>> IOMMU gets turned off by default), so it's possible that this can't
>> be seen on more modern hardware. I'll hopefully find time today to
>> check this on the one newer (Sandy Bridge) box I have.
> 
> Just got done with trying this: By default, things work fine there.
> As soon as I use "iommu=no-snoop", things go bad (even worse
> than one the older box - the guest is consuming about 2.5 CPUs
> worth of processing power _without_ the patch here in use, so I
> don't even want to think about trying it there); I guessed that to
> be another of the potential sources of the problem since on that
> older box the respective hardware feature is unavailable.

That wasn't a fair comparison: The guest here had an SR-IOV NIC
assigned. Once removing that, badness goes back to about the
level I observe on the Westmere box. I'm then relatively certain
that this "extra" badness can be attributed to the excessive use of
wbinvd. Which of course puts under question what good assigning
a device does on a system without snoop control, if overall
performance gets worse rather than better.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "Xiantao Zhang" <xiantao.zhang@intel.com>
Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, DonaldD Dugger <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, George Dunlap <george.dunlap@eu.citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 18 Feb 2014 08:45:31 +0000
Message-ID: <53032BBB020000780011D214@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 18.02.14 at 04:25, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
> I tried to reproduce it. But unfortunately, I cannot reproduce it in my box 
> (sandy bridge EP)with latest Xen(include my patch). I guess my configuration 
> or steps may wrong, here is mine:
> 
> 1. add iommu=1,no-snoop in by xen cmd line:
> (XEN) Intel VT-d Snoop Control not enabled.
> (XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
> (XEN) Intel VT-d Queued Invalidation enabled.
> (XEN) Intel VT-d Interrupt Remapping enabled.
> (XEN) Intel VT-d Shared EPT tables enabled.
> 
> 2. boot a rhel6u4 guest.
> 
> 3. after guest boot up, run startx inside guest.
> 
> 4. a few second, the X windows shows and didn't see any error. Also the CPU 
> utilization is about 1.7%.
> 
> Any thing wrong?

Nothing I can see. The main difference might be that I'm using a
SLES11 SP3 guest instead of a RHEL one.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Jan Beulich <JBeulich@suse.com>, "Zhang, Yang Z" <yang.z.zhang@intel.com>, Tim Deegan <tim@xen.org>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Dugger, Donald D" <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 18 Feb 2014 10:26:12 +0000
Message-ID: <53033544.2000409@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 02/18/2014 03:14 AM, Zhang, Yang Z wrote:
> Jan Beulich wrote on 2014-02-17:
>>>>> On 17.02.14 at 15:51, George Dunlap <george.dunlap@eu.citrix.com>
>> wrote:
>>> On 02/17/2014 10:18 AM, Jan Beulich wrote:
>>>> Actually I'm afraid there are two problems with this patch:
>>>>
>>>> For one, is enabling "global" log dirty mode still going to work
>>>> after VRAM-only mode already got enabled? I ask because the
>>>> paging_mode_log_dirty() check which paging_log_dirty_enable() does
>>>> first thing suggests otherwise to me (i.e. the now conditional
>>>> setting of all p2m entries to p2m_ram_logdirty would seem to never
>>>> get executed). IOW I would think that we're now lacking a control
>>>> operation allowing the transition from dirty VRAM tracking mode to
>>>> full log dirty mode.
>>> Hrm, will so far playing with this I've been unable to get a
>>> localhost migrate to fail with the vncviewer attached.  Which seems a bit strange...
>> Not necessarily - it may depend on how the tools actually do this:
>> They might temporarily disable log dirty mode altogether, just to
>> re-enable full mode again right away. But this specific usage of the
>> hypervisor interface wouldn't (to me) mean that other tool stacks
>> might not be doing this differently.
> You are right. Before migration, libxc will disable log dirty mode if it already enabled before and re-enable it. So when I am developing this patch, I think it is ok for migration.
>
> If there really have other tool stacks also will use this interface (Is it true?), perhaps my original patch is better which will check paging_mode_log_dirty(d) && log_global:

It turns out that the reason I couldn't get a crash was because libxc 
was actually paying attention to the -EINVAL return value, and disabling 
and then re-enabling logdirty.  That's what would happen before your 
dirty vram patch, and that's what happens after.  And arguably, that's 
the correct behavior for any toolstack, given that the interface returns 
an error.

This patch would actually change the interface; if we check this in, 
then if you enable logdirty when dirty vram tracking is enabled, you 
*won't* get an error, and thus *won't* disable and re-enable logdirty 
mode.  So actually, this patch would be more disruptive.

Thanks for the patch, though. :-)

  -George

>
> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
> index ab5eacb..368c975 100644
> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -168,7 +168,7 @@ int paging_log_dirty_enable(struct domain *d, bool_t log_global)
>   {
>       int ret;
>   
> -    if ( paging_mode_log_dirty(d) )
> +    if ( paging_mode_log_dirty(d) && !log_global )
>           return -EINVAL;
>   
>       domain_pause(d);
>
>
>> Jan
>
> Best regards,
> Yang
>
>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>, "Dongxiao Xu" <dongxiao.xu@intel.com>, "Xiantao Zhang" <xiantao.zhang@intel.com>
Cc: DonaldD Dugger <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 18 Feb 2014 11:46:32 +0000
Message-ID: <53035628020000780011D3EE@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 18.02.14 at 04:25, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
> Jan Beulich wrote on 2014-02-17:
>>>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote:
>>> And second, I have been fighting with finding both conditions and
>>> (eventually) the root cause of a severe performance regression
>>> (compared to 4.3.x) I'm observing on an EPT+IOMMU system. This
>>> became _much_ worse after adding in the patch here (while in fact I
>>> had hoped it might help with the originally observed
>>> degradation): X startup fails due to timing out, and booting the
>>> guest now takes about 20 minutes). I didn't find the root cause of
>>> this yet, but meanwhile I know that
>>> - the same isn't observable on SVM
>>> - there's no problem when forcing the domain to use shadow
>>>   mode - there's no need for any device to actually be assigned to the
>>>   guest - the regression is very likely purely graphics related (based
>>>   on the observation that when running something that regularly but not
>>>   heavily updates the screen with X up, the guest consumes a full CPU's
>>>   worth of processing power, yet when that updating doesn't happen, CPU
>>>   consumption goes down, and it goes further down when shutting down X
>>>   altogether - at least as log as the patch here doesn't get involved).
>>> This I'm observing on a Westmere box (and I didn't notice it earlier
>>> because that's one of those where due to a chipset erratum the IOMMU
>>> gets turned off by default), so it's possible that this can't be
>>> seen on more modern hardware. I'll hopefully find time today to
>>> check this on the one newer (Sandy Bridge) box I have.
>> 
>> Just got done with trying this: By default, things work fine there.
>> As soon as I use "iommu=no-snoop", things go bad (even worse than one
>> the older box - the guest is consuming about 2.5 CPUs worth of
>> processing power _without_ the patch here in use, so I don't even want
>> to think about trying it there); I guessed that to be another of the
>> potential sources of the problem since on that older box the respective 
> hardware feature is unavailable.
>> 
>> While I'll try to look into this further, I guess I have to defer to
>> our VT-d specialists at Intel at this point...
>> 
> 
> Hi, Jan,
> 
> I tried to reproduce it. But unfortunately, I cannot reproduce it in my box 
> (sandy bridge EP)with latest Xen(include my patch). I guess my configuration 
> or steps may wrong, here is mine:
> 
> 1. add iommu=1,no-snoop in by xen cmd line:
> (XEN) Intel VT-d Snoop Control not enabled.
> (XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
> (XEN) Intel VT-d Queued Invalidation enabled.
> (XEN) Intel VT-d Interrupt Remapping enabled.
> (XEN) Intel VT-d Shared EPT tables enabled.
> 
> 2. boot a rhel6u4 guest.
> 
> 3. after guest boot up, run startx inside guest.
> 
> 4. a few second, the X windows shows and didn't see any error. Also the CPU 
> utilization is about 1.7%.
> 
> Any thing wrong?

Nothing at all, as it turns out. The regression is due to Dongxiao's

http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.html

which I have in my tree as part of various things pending for 4.5.
And which at the first, second, and third glance looks pretty
innocent (IOW I still have to find out _why_ it is wrong).

In any case - I'm very sorry for the false alarm.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Dongxiao Xu" <dongxiao.xu@intel.com>, "Yang Z Zhang" <yang.z.zhang@intel.com>, "Xiantao Zhang" <xiantao.zhang@intel.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, George Dunlap <george.dunlap@eu.citrix.com>, DonaldD Dugger <donald.d.dugger@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 18 Feb 2014 15:28:35 +0000
Message-ID: <53038A33020000780011D5FD@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

[Part 1 (text/plain, inline)]
>>> On 18.02.14 at 12:46, "Jan Beulich" <JBeulich@suse.com> wrote:
> Nothing at all, as it turns out. The regression is due to Dongxiao's
> 
> http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.html 
> 
> which I have in my tree as part of various things pending for 4.5.
> And which at the first, second, and third glance looks pretty
> innocent (IOW I still have to find out _why_ it is wrong).

And here's a fixed version of the patch - we simply can't drop
the bogus HVM_PARAM_IDENT_PT check entirely yet.

In the course of fixing this I also found two other shortcomings:
- EPT EMT field should be updated upon guest MTRR writes (the
  lack thereof is the reason for needing to retain the bogus check)
- epte_get_entry_emt() either needs "order" passed, or its callers
  must call it more than once for big/huge pages

Jan

x86/hvm: refine the judgment on IDENT_PT for EMT

When trying to get the EPT EMT type, the judgment on
HVM_PARAM_IDENT_PT is not correct which always returns WB type if
the parameter is not set. Remove the related code.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>

We can't fully drop the dependency yet, but we should certainly avoid
overriding cases already properly handled. The reason for this is that
the guest setting up its MTRRs happens _after_ the EPT tables got
already constructed, and no code is in place to propagate this to the
EPT code. Without this check we're forcing the guest to run with all of
its memory uncachable until something happens to re-write every single
EPT entry. But of course this has to be just a temporary solution.

In the same spirit we should defer the "very early" (when the guest is
still being constructed and has no vCPU yet) override to the last
possible point.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -689,13 +689,8 @@ uint8_t epte_get_entry_emt(struct domain
 
     *ipat = 0;
 
-    if ( (current->domain != d) &&
-         ((d->vcpu == NULL) || ((v = d->vcpu[0]) == NULL)) )
-        return MTRR_TYPE_WRBACK;
-
-    if ( !is_pvh_vcpu(v) &&
-         !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] )
-        return MTRR_TYPE_WRBACK;
+    if ( v->domain != d )
+        v = d->vcpu ? d->vcpu[0] : NULL;
 
     if ( !mfn_valid(mfn_x(mfn)) )
         return MTRR_TYPE_UNCACHABLE;
@@ -718,7 +713,8 @@ uint8_t epte_get_entry_emt(struct domain
         return MTRR_TYPE_WRBACK;
     }
 
-    gmtrr_mtype = is_hvm_vcpu(v) ?
+    gmtrr_mtype = is_hvm_domain(d) && v &&
+                  d->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ?
                   get_mtrr_type(&v->arch.hvm_vcpu.mtrr, (gfn << PAGE_SHIFT)) :
                   MTRR_TYPE_WRBACK;
[EPT-ignore-IDENT_PT (application/octet-stream, attachment)]
[Part 3 (text/plain, inline)]

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Jan Beulich <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Xu, Dongxiao" <dongxiao.xu@intel.com>, George Dunlap <george.dunlap@eu.citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Dugger, Donald D" <donald.d.dugger@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 19 Feb 2014 01:17:08 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9EEB9B@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

Jan Beulich wrote on 2014-02-18:
>>>> On 18.02.14 at 04:25, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
>> Jan Beulich wrote on 2014-02-17:
>>>>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>> And second, I have been fighting with finding both conditions and
>>>> (eventually) the root cause of a severe performance regression
>>>> (compared to 4.3.x) I'm observing on an EPT+IOMMU system. This
>>>> became _much_ worse after adding in the patch here (while in fact
>>>> I had hoped it might help with the originally observed
>>>> degradation): X startup fails due to timing out, and booting the
>>>> guest now takes about 20 minutes). I didn't find the root cause of
>>>> this yet, but meanwhile I know that
>>>> - the same isn't observable on SVM
>>>> - there's no problem when forcing the domain to use shadow
>>>>   mode - there's no need for any device to actually be assigned to the
>>>>   guest - the regression is very likely purely graphics related (based
>>>>   on the observation that when running something that regularly but not
>>>>   heavily updates the screen with X up, the guest consumes a full CPU's
>>>>   worth of processing power, yet when that updating doesn't
>>>> happen,
> CPU
>>>>   consumption goes down, and it goes further down when shutting
>>>> down
> X
>>>>   altogether - at least as log as the patch here doesn't get involved).
>>>> This I'm observing on a Westmere box (and I didn't notice it
>>>> earlier because that's one of those where due to a chipset erratum
>>>> the IOMMU gets turned off by default), so it's possible that this
>>>> can't be seen on more modern hardware. I'll hopefully find time
>>>> today to check this on the one newer (Sandy Bridge) box I have.
>>> 
>>> Just got done with trying this: By default, things work fine there. As
>>> soon as I use "iommu=no-snoop", things go bad (even worse than one the
>>> older box - the guest is consuming about 2.5 CPUs worth of processing
>>> power _without_ the patch here in use, so I don't even want to think
>>> about trying it there); I guessed that to be another of the potential
>>> sources of the problem since on that older box the respective hardware
>>> feature is unavailable.
>>> 
>>> While I'll try to look into this further, I guess I have to defer
>>> to our VT-d specialists at Intel at this point...
>>> 
>> 
>> Hi, Jan,
>> 
>> I tried to reproduce it. But unfortunately, I cannot reproduce it in
>> my box (sandy bridge EP)with latest Xen(include my patch). I guess
>> my configuration or steps may wrong, here is mine:
>> 
>> 1. add iommu=1,no-snoop in by xen cmd line:
>> (XEN) Intel VT-d Snoop Control not enabled.
>> (XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
>> (XEN) Intel VT-d Queued Invalidation enabled.
>> (XEN) Intel VT-d Interrupt Remapping enabled.
>> (XEN) Intel VT-d Shared EPT tables enabled.
>> 
>> 2. boot a rhel6u4 guest.
>> 
>> 3. after guest boot up, run startx inside guest.
>> 
>> 4. a few second, the X windows shows and didn't see any error. Also
>> the CPU utilization is about 1.7%.
>> 
>> Any thing wrong?
> 
> Nothing at all, as it turns out. The regression is due to Dongxiao's
> 
> http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.h
> tml
> 
> which I have in my tree as part of various things pending for 4.5.
> And which at the first, second, and third glance looks pretty innocent
> (IOW I still have to find out _why_ it is wrong).
> 
> In any case - I'm very sorry for the false alarm.
> 

It doesn't matter. Conversely, we need to thank you for helping us to fix this issue. :)

BTW, I still cannot reproduce it in my box, even I uses SLES 11 SP3 as guest.

> Jan


Best regards,
Yang


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Tim Deegan <tim@xen.org>, Jan Beulich <JBeulich@suse.com>, George Dunlap <george.dunlap@eu.citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Dugger, Donald D" <donald.d.dugger@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 19 Feb 2014 01:28:59 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0A9EEBC9@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

George Dunlap wrote on 2014-02-18:
> On 02/18/2014 03:14 AM, Zhang, Yang Z wrote:
>> Jan Beulich wrote on 2014-02-17:
>>>>>> On 17.02.14 at 15:51, George Dunlap
>>>>>> <george.dunlap@eu.citrix.com>
>>> wrote:
>>>> On 02/17/2014 10:18 AM, Jan Beulich wrote:
>>>>> Actually I'm afraid there are two problems with this patch:
>>>>> 
>>>>> For one, is enabling "global" log dirty mode still going to work
>>>>> after VRAM-only mode already got enabled? I ask because the
>>>>> paging_mode_log_dirty() check which paging_log_dirty_enable()
>>>>> does first thing suggests otherwise to me (i.e. the now
>>>>> conditional setting of all p2m entries to p2m_ram_logdirty would
>>>>> seem to never get executed). IOW I would think that we're now
>>>>> lacking a control operation allowing the transition from dirty
>>>>> VRAM tracking mode to full log dirty mode.
>>>> Hrm, will so far playing with this I've been unable to get a
>>>> localhost migrate to fail with the vncviewer attached.  Which
>>>> seems a bit
> strange...
>>> Not necessarily - it may depend on how the tools actually do this:
>>> They might temporarily disable log dirty mode altogether, just to
>>> re-enable full mode again right away. But this specific usage of
>>> the hypervisor interface wouldn't (to me) mean that other tool
>>> stacks might not be doing this differently.
>> You are right. Before migration, libxc will disable log dirty mode
>> if it already
> enabled before and re-enable it. So when I am developing this patch, I
> think it is ok for migration.
>> 
>> If there really have other tool stacks also will use this interface
>> (Is it true?),
> perhaps my original patch is better which will check
> paging_mode_log_dirty(d) && log_global:
> 
> It turns out that the reason I couldn't get a crash was because libxc
> was actually paying attention to the -EINVAL return value, and
> disabling and then re-enabling logdirty.  That's what would happen
> before your dirty vram patch, and that's what happens after.  And
> arguably, that's the correct behavior for any toolstack, given that the interface returns an error.

Agree.

> 
> This patch would actually change the interface; if we check this in,
> then if you enable logdirty when dirty vram tracking is enabled, you
> *won't* get an error, and thus *won't* disable and re-enable logdirty mode.
> So actually, this patch would be more disruptive.
> 

Jan, do you have any comment? 

> Thanks for the patch, though. :-)
> 
>   -George
>> 
>> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
>> index
>> ab5eacb..368c975 100644
>> --- a/xen/arch/x86/mm/paging.c
>> +++ b/xen/arch/x86/mm/paging.c
>> @@ -168,7 +168,7 @@ int paging_log_dirty_enable(struct domain *d,
> bool_t log_global)
>>   {
>>       int ret;
>> -    if ( paging_mode_log_dirty(d) )
>> +    if ( paging_mode_log_dirty(d) && !log_global )
>>           return -EINVAL;
>>       domain_pause(d);
>> 
>>> Jan
>> 
>> Best regards,
>> Yang
>> 
>>


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Xu, Dongxiao" <dongxiao.xu@intel.com>
To: "Zhang, Xiantao" <xiantao.zhang@intel.com>, Jan Beulich <JBeulich@suse.com>, "Zhang, Yang Z" <yang.z.zhang@intel.com>
Cc: "Dugger, Donald D" <donald.d.dugger@intel.com>, George Dunlap <george.dunlap@eu.citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 19 Feb 2014 06:40:54 +0000
Message-ID: <40776A41FC278F40B59438AD47D147A91194BB6B@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, February 18, 2014 11:29 PM
> To: Xu, Dongxiao; Zhang, Xiantao; Zhang, Yang Z
> Cc: andrew.cooper3@citrix.com; George Dunlap; Dugger, Donald D;
> xen-devel@lists.xen.org; Tim Deegan
> Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty
> to track vram
> 
> >>> On 18.02.14 at 12:46, "Jan Beulich" <JBeulich@suse.com> wrote:
> > Nothing at all, as it turns out. The regression is due to Dongxiao's
> >
> > http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.html
> >
> > which I have in my tree as part of various things pending for 4.5.
> > And which at the first, second, and third glance looks pretty
> > innocent (IOW I still have to find out _why_ it is wrong).
> 
> And here's a fixed version of the patch - we simply can't drop
> the bogus HVM_PARAM_IDENT_PT check entirely yet.
> 
> In the course of fixing this I also found two other shortcomings:
> - EPT EMT field should be updated upon guest MTRR writes (the
>   lack thereof is the reason for needing to retain the bogus check)
> - epte_get_entry_emt() either needs "order" passed, or its callers
>   must call it more than once for big/huge pages
> 

The below patch looks okay to me. Thanks!

Thanks,
Dongxiao

> Jan
> 
> x86/hvm: refine the judgment on IDENT_PT for EMT
> 
> When trying to get the EPT EMT type, the judgment on
> HVM_PARAM_IDENT_PT is not correct which always returns WB type if
> the parameter is not set. Remove the related code.
> 
> Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
> 
> We can't fully drop the dependency yet, but we should certainly avoid
> overriding cases already properly handled. The reason for this is that
> the guest setting up its MTRRs happens _after_ the EPT tables got
> already constructed, and no code is in place to propagate this to the
> EPT code. Without this check we're forcing the guest to run with all of
> its memory uncachable until something happens to re-write every single
> EPT entry. But of course this has to be just a temporary solution.
> 
> In the same spirit we should defer the "very early" (when the guest is
> still being constructed and has no vCPU yet) override to the last
> possible point.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -689,13 +689,8 @@ uint8_t epte_get_entry_emt(struct domain
> 
>      *ipat = 0;
> 
> -    if ( (current->domain != d) &&
> -         ((d->vcpu == NULL) || ((v = d->vcpu[0]) == NULL)) )
> -        return MTRR_TYPE_WRBACK;
> -
> -    if ( !is_pvh_vcpu(v) &&
> -         !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] )
> -        return MTRR_TYPE_WRBACK;
> +    if ( v->domain != d )
> +        v = d->vcpu ? d->vcpu[0] : NULL;
> 
>      if ( !mfn_valid(mfn_x(mfn)) )
>          return MTRR_TYPE_UNCACHABLE;
> @@ -718,7 +713,8 @@ uint8_t epte_get_entry_emt(struct domain
>          return MTRR_TYPE_WRBACK;
>      }
> 
> -    gmtrr_mtype = is_hvm_vcpu(v) ?
> +    gmtrr_mtype = is_hvm_domain(d) && v &&
> +                  d->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ?
>                    get_mtrr_type(&v->arch.hvm_vcpu.mtrr, (gfn <<
> PAGE_SHIFT)) :
>                    MTRR_TYPE_WRBACK;
> 
> 
> 
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: Dongxiao Xu <dongxiao.xu@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, George Dunlap <george.dunlap@eu.citrix.com>, DonaldD Dugger <donald.d.dugger@intel.com>, Xiantao Zhang <xiantao.zhang@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 19 Feb 2014 08:50:55 +0000
Message-ID: <53047E7F020000780011D8E1@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 19.02.14 at 02:17, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
> Jan Beulich wrote on 2014-02-18:
>>>>> On 18.02.14 at 04:25, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
>>> Jan Beulich wrote on 2014-02-17:
>>>>>>> On 17.02.14 at 11:18, "Jan Beulich" <JBeulich@suse.com> wrote:
>>>>> And second, I have been fighting with finding both conditions and
>>>>> (eventually) the root cause of a severe performance regression
>>>>> (compared to 4.3.x) I'm observing on an EPT+IOMMU system. This
>>>>> became _much_ worse after adding in the patch here (while in fact
>>>>> I had hoped it might help with the originally observed
>>>>> degradation): X startup fails due to timing out, and booting the
>>>>> guest now takes about 20 minutes). I didn't find the root cause of
>>>>> this yet, but meanwhile I know that
>>>>> - the same isn't observable on SVM
>>>>> - there's no problem when forcing the domain to use shadow
>>>>>   mode - there's no need for any device to actually be assigned to the
>>>>>   guest - the regression is very likely purely graphics related (based
>>>>>   on the observation that when running something that regularly but not
>>>>>   heavily updates the screen with X up, the guest consumes a full CPU's
>>>>>   worth of processing power, yet when that updating doesn't
>>>>> happen,
>> CPU
>>>>>   consumption goes down, and it goes further down when shutting
>>>>> down
>> X
>>>>>   altogether - at least as log as the patch here doesn't get involved).
>>>>> This I'm observing on a Westmere box (and I didn't notice it
>>>>> earlier because that's one of those where due to a chipset erratum
>>>>> the IOMMU gets turned off by default), so it's possible that this
>>>>> can't be seen on more modern hardware. I'll hopefully find time
>>>>> today to check this on the one newer (Sandy Bridge) box I have.
>>>> 
>>>> Just got done with trying this: By default, things work fine there. As
>>>> soon as I use "iommu=no-snoop", things go bad (even worse than one the
>>>> older box - the guest is consuming about 2.5 CPUs worth of processing
>>>> power _without_ the patch here in use, so I don't even want to think
>>>> about trying it there); I guessed that to be another of the potential
>>>> sources of the problem since on that older box the respective hardware
>>>> feature is unavailable.
>>>> 
>>>> While I'll try to look into this further, I guess I have to defer
>>>> to our VT-d specialists at Intel at this point...
>>>> 
>>> 
>>> Hi, Jan,
>>> 
>>> I tried to reproduce it. But unfortunately, I cannot reproduce it in
>>> my box (sandy bridge EP)with latest Xen(include my patch). I guess
>>> my configuration or steps may wrong, here is mine:
>>> 
>>> 1. add iommu=1,no-snoop in by xen cmd line:
>>> (XEN) Intel VT-d Snoop Control not enabled.
>>> (XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
>>> (XEN) Intel VT-d Queued Invalidation enabled.
>>> (XEN) Intel VT-d Interrupt Remapping enabled.
>>> (XEN) Intel VT-d Shared EPT tables enabled.
>>> 
>>> 2. boot a rhel6u4 guest.
>>> 
>>> 3. after guest boot up, run startx inside guest.
>>> 
>>> 4. a few second, the X windows shows and didn't see any error. Also
>>> the CPU utilization is about 1.7%.
>>> 
>>> Any thing wrong?
>> 
>> Nothing at all, as it turns out. The regression is due to Dongxiao's
>> 
>> http://lists.xenproject.org/archives/html/xen-devel/2013-12/msg00367.h 
>> tml
>> 
>> which I have in my tree as part of various things pending for 4.5.
>> And which at the first, second, and third glance looks pretty innocent
>> (IOW I still have to find out _why_ it is wrong).
>> 
>> In any case - I'm very sorry for the false alarm.
>> 
> 
> It doesn't matter. Conversely, we need to thank you for helping us to fix 
> this issue. :)
> 
> BTW, I still cannot reproduce it in my box, even I uses SLES 11 SP3 as 
> guest.

I assume you didn't pull in the broken patch - I'm sure you would
see the problem if you did.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Donald D Dugger <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, George Dunlap <george.dunlap@eu.citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 19 Feb 2014 08:55:00 +0000
Message-ID: <53047F74020000780011D8E4@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 19.02.14 at 02:28, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
> George Dunlap wrote on 2014-02-18:
>> On 02/18/2014 03:14 AM, Zhang, Yang Z wrote:
>> perhaps my original patch is better which will check
>> paging_mode_log_dirty(d) && log_global:
>> 
>> It turns out that the reason I couldn't get a crash was because libxc
>> was actually paying attention to the -EINVAL return value, and
>> disabling and then re-enabling logdirty.  That's what would happen
>> before your dirty vram patch, and that's what happens after.  And
>> arguably, that's the correct behavior for any toolstack, given that the 
> interface returns an error.
> 
> Agree.
> 
>> 
>> This patch would actually change the interface; if we check this in,
>> then if you enable logdirty when dirty vram tracking is enabled, you
>> *won't* get an error, and thus *won't* disable and re-enable logdirty mode.
>> So actually, this patch would be more disruptive.
>> 
> 
> Jan, do you have any comment? 

This simplistic variant is just calling for problems. As was already
said elsewhere on this thread, we should simply do the mode change
properly: Track that a partial log-dirty mode is in use, and allow
switching to global log-dirty mode (converting all entries to R/O).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Yang Z Zhang <yang.z.zhang@intel.com>, Jan Beulich <JBeulich@suse.com>
Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Donald D Dugger <donald.d.dugger@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 19 Feb 2014 11:03:50 +0000
Message-ID: <53048F96.7010503@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 02/19/2014 08:55 AM, Jan Beulich wrote:
>>>> On 19.02.14 at 02:28, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
>> George Dunlap wrote on 2014-02-18:
>>> On 02/18/2014 03:14 AM, Zhang, Yang Z wrote:
>>> perhaps my original patch is better which will check
>>> paging_mode_log_dirty(d) && log_global:
>>>
>>> It turns out that the reason I couldn't get a crash was because libxc
>>> was actually paying attention to the -EINVAL return value, and
>>> disabling and then re-enabling logdirty.  That's what would happen
>>> before your dirty vram patch, and that's what happens after.  And
>>> arguably, that's the correct behavior for any toolstack, given that the
>> interface returns an error.
>>
>> Agree.
>>
>>> This patch would actually change the interface; if we check this in,
>>> then if you enable logdirty when dirty vram tracking is enabled, you
>>> *won't* get an error, and thus *won't* disable and re-enable logdirty mode.
>>> So actually, this patch would be more disruptive.
>>>
>> Jan, do you have any comment?
> This simplistic variant is just calling for problems. As was already
> said elsewhere on this thread, we should simply do the mode change
> properly: Track that a partial log-dirty mode is in use, and allow
> switching to global log-dirty mode (converting all entries to R/O).

I think Yang was asking you for your opinion on my suggestion that 
nothing actually needed to be done.  Enabling full logdirty mode for 
migration when dirty vram tracking was enabled has *always* returned an 
error (or at least for a long time now), and *always* resulted in the 
toolstack disabling and re-enabling logdirty mode; Yang's patch doesn't 
change that at all.

If you think that's an interface we need to improve in the future, we 
can put it on the list of improvements.  But at this point it seems to 
me more like a nice-to-have.

  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <george.dunlap@eu.citrix.com>
Cc: Donald D Dugger <donald.d.dugger@intel.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 19 Feb 2014 11:13:07 +0000
Message-ID: <53049FD3020000780011DA37@nat28.tlf.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 19.02.14 at 12:03, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> On 02/19/2014 08:55 AM, Jan Beulich wrote:
>>>>> On 19.02.14 at 02:28, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
>>> George Dunlap wrote on 2014-02-18:
>>>> On 02/18/2014 03:14 AM, Zhang, Yang Z wrote:
>>>> perhaps my original patch is better which will check
>>>> paging_mode_log_dirty(d) && log_global:
>>>>
>>>> It turns out that the reason I couldn't get a crash was because libxc
>>>> was actually paying attention to the -EINVAL return value, and
>>>> disabling and then re-enabling logdirty.  That's what would happen
>>>> before your dirty vram patch, and that's what happens after.  And
>>>> arguably, that's the correct behavior for any toolstack, given that the
>>> interface returns an error.
>>>
>>> Agree.
>>>
>>>> This patch would actually change the interface; if we check this in,
>>>> then if you enable logdirty when dirty vram tracking is enabled, you
>>>> *won't* get an error, and thus *won't* disable and re-enable logdirty mode.
>>>> So actually, this patch would be more disruptive.
>>>>
>>> Jan, do you have any comment?
>> This simplistic variant is just calling for problems. As was already
>> said elsewhere on this thread, we should simply do the mode change
>> properly: Track that a partial log-dirty mode is in use, and allow
>> switching to global log-dirty mode (converting all entries to R/O).
> 
> I think Yang was asking you for your opinion on my suggestion that 
> nothing actually needed to be done.  Enabling full logdirty mode for 
> migration when dirty vram tracking was enabled has *always* returned an 
> error (or at least for a long time now), and *always* resulted in the 
> toolstack disabling and re-enabling logdirty mode; Yang's patch doesn't 
> change that at all.
> 
> If you think that's an interface we need to improve in the future, we 
> can put it on the list of improvements.  But at this point it seems to 
> me more like a nice-to-have.

I agree - for 4.4.0 we shouldn't need any further adjustments. And
I hoped to imply that I don't see a need for this incremental change
to go in by having said "This simplistic variant is just calling for
problems".

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Jan Beulich <JBeulich@suse.com>, Yang Z Zhang <yang.z.zhang@intel.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Xiantao Zhang <xiantao.zhang@intel.com>, Donald D Dugger <donald.d.dugger@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 19 Feb 2014 11:17:43 +0000
Message-ID: <530492D7.6050503@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 02/19/2014 11:13 AM, Jan Beulich wrote:
>>>> On 19.02.14 at 12:03, George Dunlap <george.dunlap@eu.citrix.com> wrote:
>> On 02/19/2014 08:55 AM, Jan Beulich wrote:
>>>>>> On 19.02.14 at 02:28, "Zhang, Yang Z" <yang.z.zhang@intel.com> wrote:
>>>> George Dunlap wrote on 2014-02-18:
>>>>> On 02/18/2014 03:14 AM, Zhang, Yang Z wrote:
>>>>> perhaps my original patch is better which will check
>>>>> paging_mode_log_dirty(d) && log_global:
>>>>>
>>>>> It turns out that the reason I couldn't get a crash was because libxc
>>>>> was actually paying attention to the -EINVAL return value, and
>>>>> disabling and then re-enabling logdirty.  That's what would happen
>>>>> before your dirty vram patch, and that's what happens after.  And
>>>>> arguably, that's the correct behavior for any toolstack, given that the
>>>> interface returns an error.
>>>>
>>>> Agree.
>>>>
>>>>> This patch would actually change the interface; if we check this in,
>>>>> then if you enable logdirty when dirty vram tracking is enabled, you
>>>>> *won't* get an error, and thus *won't* disable and re-enable logdirty mode.
>>>>> So actually, this patch would be more disruptive.
>>>>>
>>>> Jan, do you have any comment?
>>> This simplistic variant is just calling for problems. As was already
>>> said elsewhere on this thread, we should simply do the mode change
>>> properly: Track that a partial log-dirty mode is in use, and allow
>>> switching to global log-dirty mode (converting all entries to R/O).
>> I think Yang was asking you for your opinion on my suggestion that
>> nothing actually needed to be done.  Enabling full logdirty mode for
>> migration when dirty vram tracking was enabled has *always* returned an
>> error (or at least for a long time now), and *always* resulted in the
>> toolstack disabling and re-enabling logdirty mode; Yang's patch doesn't
>> change that at all.
>>
>> If you think that's an interface we need to improve in the future, we
>> can put it on the list of improvements.  But at this point it seems to
>> me more like a nice-to-have.
> I agree - for 4.4.0 we shouldn't need any further adjustments. And
> I hoped to imply that I don't see a need for this incremental change
> to go in by having said "This simplistic variant is just calling for
> problems".

No, but "we should simply do the mode change properly" could be 
interpreted as saying, "this needs to be done as a follow-up to the 
dirty vram tracking patch"; someone might even interpret it as, "you 
need to do this as a follow-up".  That's what I was trying to clarify / 
express an opinion on. :-)

  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Tim Deegan <tim@xen.org>, Jan Beulich <JBeulich@suse.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, George Dunlap <george.dunlap@eu.citrix.com>, "'Keir Fraser \(keir.xen@gmail.com\)'" <keir.xen@gmail.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 19 May 2014 07:48:01 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACCE42@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

Tim Deegan wrote on 2014-02-14:
> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote:
>>>>> On 13.02.14 at 16:46, George Dunlap
>>>>> <george.dunlap@eu.citrix.com>
> wrote:
>>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
>>>> George Dunlap wrote on 2014-02-11:
>>>>> I think I got a bit distracted with the "A isn't really so bad" thing.
>>>>> Actually, if the overhead of not sharing tables isn't very high,
>>>>> then B isn't such a bad option.  In fact, B is what I expected
>>>>> Yang to submit when he originally described the problem.
>>>> Actually, the first solution came to my mind is B. Then I
>>>> realized that even
>>> chose B, we still cannot track the memory updating from DMA(even with
>>> A/D bit, it still a problem). Also, considering the current usage case
>>> of log dirty in Xen(only vram tracking has problem), I though A is
>>> better.: Hypervisor only need to track the vram change. If a malicious
>>> guest try to DMA to vram range, it only crashed himself (This should be
> reasonable).
>>>> 
>>>>> I was going to say, from a release perspective, B is probably
>>>>> the safest option for now.  But on the other hand, if we've been
>>>>> testing sharing all this time, maybe switching back over to
>>>>> non-sharing whole-hog has
>>> the higher risk?
>>>> Another problem with B is that current VT-d large paging
>>>> supporting relies on
>>> the sharing EPT and VT-d page table. This means if we choose B,
>>> then we need to re-enable VT-d large page. This would be a huge
>>> performance impaction for Xen 4.4 on using VT-d solution.
>>> 
>>> OK -- if that's the case, then it definitely tips the balance back
>>> to A.  Unless Tim or Jan disagrees, can one of you two check it in?
>>> 
>>> Don't rush your judgement; but it would be nice to have this in
>>> before RC4, which would mean checking it in today preferrably, or
>>> early tomorrow at the latest.
>> 
>> That would be Tim then, as he would have to approve of it anyway.
> 
> Done.
> 
>> I should also say that while I certainly understand the
>> argumentation above, I would still want to go this route only with
>> the promise that B is going to be worked on reasonably soon after
>> the release, ideally with the goal of backporting the changes for 4.4.1.
> 
> Agreed.
> 
> Tim.

Hi all

Sorry to turn out this old thread. 
Because I just noticed that someone is asking when Intel will implement the VT-d page table separately. Actually, I am totally unaware it. The original issue that this patch tries to fix is the VRAM tracking which using the global log dirty mode. And I thought the best solution to fix it is in VRAM side not VT-d side. Because even use separate VT-d page table, we still cannot track the memory update from DMA. Even worse, I think two page tables introduce redundant code and maintain effort. So I wonder is it really necessary to implement the separate VT-d large page?


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com>, Keir Fraser <keir@xen.org>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <george.dunlap@eu.citrix.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 19 May 2014 10:03:35 +0100
Message-ID: <5379E50702000078000136F6@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 19.05.14 at 09:48, <yang.z.zhang@intel.com> wrote:
> Because I just noticed that someone is asking when Intel will implement the 
> VT-d page table separately. Actually, I am totally unaware it.

This was a request sent directly to you, so it shouldn't be a surprise.

> The original 
> issue that this patch tries to fix is the VRAM tracking which using the 
> global log dirty mode. And I thought the best solution to fix it is in VRAM 
> side not VT-d side. Because even use separate VT-d page table, we still cannot 
> track the memory update from DMA.

Correct. But at least we can avoid IOMMU faults by not marking read-
only the VRAM region. Unless the guest stores data in the VRAM region
that's not directly displayed information, but needs to be preserved
across migration, the only downside of this would be temporary screen
corruption in the guest immediately following migration. Clearly far
better than eventually turning off passed through devices due to
excessive IOMMU faults.

> Even worse, I think two page tables 
> introduce redundant code and maintain effort. So I wonder is it really 
> necessary to implement the separate VT-d large page?

Tim and I certainly think so. Andrew had a valid point in stating that
guests without the need for VRAM dirty tracking, but with assigned
PCI device(s) could still benefit from sharing page tables, so working
towards a model where this could be retained would clearly be the
optimal solution.

I wonder whether you actually too the time to go through the old
thread before writing your mail, since all you did is repeat your old
arguments, without addressing any of the reasons given why we
consider separate page tables better than shared ones.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <George.Dunlap@eu.citrix.com>
To: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jan Beulich <JBeulich@suse.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Keir Fraser \(keir.xen@gmail.com\)" <keir.xen@gmail.com>, "Nakajima, Jun" <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 19 May 2014 14:27:59 +0100
Message-ID: <CAFLBxZbn3X3EPjMyHqhNi4m18gYcTWnXK+Vwb3nC7PZUiXD2uw@mail.gmail.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On Mon, May 19, 2014 at 8:48 AM, Zhang, Yang Z <yang.z.zhang@intel.com> wrote:
> Tim Deegan wrote on 2014-02-14:
>> At 15:55 +0000 on 13 Feb (1392303343), Jan Beulich wrote:
>>>>>> On 13.02.14 at 16:46, George Dunlap
>>>>>> <george.dunlap@eu.citrix.com>
>> wrote:
>>>> On 02/12/2014 12:53 AM, Zhang, Yang Z wrote:
>>>>> George Dunlap wrote on 2014-02-11:
>>>>>> I think I got a bit distracted with the "A isn't really so bad" thing.
>>>>>> Actually, if the overhead of not sharing tables isn't very high,
>>>>>> then B isn't such a bad option.  In fact, B is what I expected
>>>>>> Yang to submit when he originally described the problem.
>>>>> Actually, the first solution came to my mind is B. Then I
>>>>> realized that even
>>>> chose B, we still cannot track the memory updating from DMA(even with
>>>> A/D bit, it still a problem). Also, considering the current usage case
>>>> of log dirty in Xen(only vram tracking has problem), I though A is
>>>> better.: Hypervisor only need to track the vram change. If a malicious
>>>> guest try to DMA to vram range, it only crashed himself (This should be
>> reasonable).
>>>>>
>>>>>> I was going to say, from a release perspective, B is probably
>>>>>> the safest option for now.  But on the other hand, if we've been
>>>>>> testing sharing all this time, maybe switching back over to
>>>>>> non-sharing whole-hog has
>>>> the higher risk?
>>>>> Another problem with B is that current VT-d large paging
>>>>> supporting relies on
>>>> the sharing EPT and VT-d page table. This means if we choose B,
>>>> then we need to re-enable VT-d large page. This would be a huge
>>>> performance impaction for Xen 4.4 on using VT-d solution.
>>>>
>>>> OK -- if that's the case, then it definitely tips the balance back
>>>> to A.  Unless Tim or Jan disagrees, can one of you two check it in?
>>>>
>>>> Don't rush your judgement; but it would be nice to have this in
>>>> before RC4, which would mean checking it in today preferrably, or
>>>> early tomorrow at the latest.
>>>
>>> That would be Tim then, as he would have to approve of it anyway.
>>
>> Done.
>>
>>> I should also say that while I certainly understand the
>>> argumentation above, I would still want to go this route only with
>>> the promise that B is going to be worked on reasonably soon after
>>> the release, ideally with the goal of backporting the changes for 4.4.1.
>>
>> Agreed.
>>
>> Tim.
>
> Hi all
>
> Sorry to turn out this old thread.
> Because I just noticed that someone is asking when Intel will implement the VT-d page table separately. Actually, I am totally unaware it. The original issue that this patch tries to fix is the VRAM tracking which using the global log dirty mode. And I thought the best solution to fix it is in VRAM side not VT-d side. Because even use separate VT-d page table, we still cannot track the memory update from DMA. Even worse, I think two page tables introduce redundant code and maintain effort. So I wonder is it really necessary to implement the separate VT-d large page?

Yes, it does introduce redundant code.  But unfortunately, IOMMU
faults at the moment have to be considered rather risky; having on
happens risks (in order of decreasing probability / increasing
damage):
* Device stops working for that VM until an FLR (losing a lot of its state)
* The VM has to be killed
* The device stops working until a host reboot
* The host crashes

Avoiding these by "hoping" that the guest OS doesn't DMA into a video
buffer isn't really robust enough.  I think that was Tim and Jan's
primary reason for wanting the ability to have separate tables for HAP
and IOMMU.

Is that about right, Jan / Tim?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>, "George Dunlap" <George.Dunlap@eu.citrix.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "Keir Fraser \(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 19 May 2014 14:50:39 +0100
Message-ID: <537A284F0200007800013ADC@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 19.05.14 at 15:27, <George.Dunlap@eu.citrix.com> wrote:
> On Mon, May 19, 2014 at 8:48 AM, Zhang, Yang Z <yang.z.zhang@intel.com> wrote:
>> Because I just noticed that someone is asking when Intel will implement the 
> VT-d page table separately. Actually, I am totally unaware it. The original 
> issue that this patch tries to fix is the VRAM tracking which using the 
> global log dirty mode. And I thought the best solution to fix it is in VRAM 
> side not VT-d side. Because even use separate VT-d page table, we still 
> cannot track the memory update from DMA. Even worse, I think two page tables 
> introduce redundant code and maintain effort. So I wonder is it really 
> necessary to implement the separate VT-d large page?
> 
> Yes, it does introduce redundant code.  But unfortunately, IOMMU
> faults at the moment have to be considered rather risky; having on
> happens risks (in order of decreasing probability / increasing
> damage):
> * Device stops working for that VM until an FLR (losing a lot of its state)
> * The VM has to be killed
> * The device stops working until a host reboot
> * The host crashes
> 
> Avoiding these by "hoping" that the guest OS doesn't DMA into a video
> buffer isn't really robust enough.  I think that was Tim and Jan's
> primary reason for wanting the ability to have separate tables for HAP
> and IOMMU.
> 
> Is that about right, Jan / Tim?

Yes, and not just "about" (perhaps with the exception that I think/
hope we don't have any lurking host crashes here).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Jan Beulich <JBeulich@suse.com>, Yang Z Zhang <yang.z.zhang@intel.com>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>, "Keir Fraser \(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 19 May 2014 14:59:49 +0100
Message-ID: <537A0E55.502@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 05/19/2014 02:50 PM, Jan Beulich wrote:
>>>> On 19.05.14 at 15:27, <George.Dunlap@eu.citrix.com> wrote:
>> On Mon, May 19, 2014 at 8:48 AM, Zhang, Yang Z <yang.z.zhang@intel.com> wrote:
>>> Because I just noticed that someone is asking when Intel will implement the
>> VT-d page table separately. Actually, I am totally unaware it. The original
>> issue that this patch tries to fix is the VRAM tracking which using the
>> global log dirty mode. And I thought the best solution to fix it is in VRAM
>> side not VT-d side. Because even use separate VT-d page table, we still
>> cannot track the memory update from DMA. Even worse, I think two page tables
>> introduce redundant code and maintain effort. So I wonder is it really
>> necessary to implement the separate VT-d large page?
>>
>> Yes, it does introduce redundant code.  But unfortunately, IOMMU
>> faults at the moment have to be considered rather risky; having on
>> happens risks (in order of decreasing probability / increasing
>> damage):
>> * Device stops working for that VM until an FLR (losing a lot of its state)
>> * The VM has to be killed
>> * The device stops working until a host reboot
>> * The host crashes
>>
>> Avoiding these by "hoping" that the guest OS doesn't DMA into a video
>> buffer isn't really robust enough.  I think that was Tim and Jan's
>> primary reason for wanting the ability to have separate tables for HAP
>> and IOMMU.
>>
>> Is that about right, Jan / Tim?
> Yes, and not just "about" (perhaps with the exception that I think/
> hope we don't have any lurking host crashes here).

I think the fear was that buggy hardware might cause a host crash / hang.

  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "George Dunlap" <george.dunlap@eu.citrix.com>, "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 19 May 2014 15:19:16 +0100
Message-ID: <537A2F040200007800013B6B@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 19.05.14 at 15:59, <george.dunlap@eu.citrix.com> wrote:
> On 05/19/2014 02:50 PM, Jan Beulich wrote:
>>>>> On 19.05.14 at 15:27, <George.Dunlap@eu.citrix.com> wrote:
>>> On Mon, May 19, 2014 at 8:48 AM, Zhang, Yang Z <yang.z.zhang@intel.com> wrote:
>>>> Because I just noticed that someone is asking when Intel will implement the
>>> VT-d page table separately. Actually, I am totally unaware it. The original
>>> issue that this patch tries to fix is the VRAM tracking which using the
>>> global log dirty mode. And I thought the best solution to fix it is in VRAM
>>> side not VT-d side. Because even use separate VT-d page table, we still
>>> cannot track the memory update from DMA. Even worse, I think two page tables
>>> introduce redundant code and maintain effort. So I wonder is it really
>>> necessary to implement the separate VT-d large page?
>>>
>>> Yes, it does introduce redundant code.  But unfortunately, IOMMU
>>> faults at the moment have to be considered rather risky; having on
>>> happens risks (in order of decreasing probability / increasing
>>> damage):
>>> * Device stops working for that VM until an FLR (losing a lot of its state)
>>> * The VM has to be killed
>>> * The device stops working until a host reboot
>>> * The host crashes
>>>
>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a video
>>> buffer isn't really robust enough.  I think that was Tim and Jan's
>>> primary reason for wanting the ability to have separate tables for HAP
>>> and IOMMU.
>>>
>>> Is that about right, Jan / Tim?
>> Yes, and not just "about" (perhaps with the exception that I think/
>> hope we don't have any lurking host crashes here).
> 
> I think the fear was that buggy hardware might cause a host crash / hang.

That's a valid concern indeed.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com>, Keir Fraser <keir@xen.org>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, George Dunlap <george.dunlap@eu.citrix.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 20 May 2014 03:09:10 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACDEA3@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

Jan Beulich wrote on 2014-05-19:
>>>> On 19.05.14 at 09:48, <yang.z.zhang@intel.com> wrote:
>> Because I just noticed that someone is asking when Intel will
>> implement the VT-d page table separately. Actually, I am totally unaware it.
> 
> This was a request sent directly to you, so it shouldn't be a surprise.

Yes, but I am not buying in it since I think it not the right direction to fix this issue:

This is my original point:
Actually, the first solution came to my mind is B. Then I realized that even chose B, we still cannot track the memory updating from DMA(even with A/D bit, it still a problem). Also, considering the current usage case of log dirty in Xen(only vram tracking has problem), I though A is better.: Hypervisor only need to track the vram change. If a malicious guest try to DMA to vram range, it only crashed himself (This should be reasonable).


> 
>> The original
>> issue that this patch tries to fix is the VRAM tracking which using
>> the global log dirty mode. And I thought the best solution to fix it
>> is in VRAM side not VT-d side. Because even use separate VT-d page
>> table, we still cannot track the memory update from DMA.
> 
> Correct. But at least we can avoid IOMMU faults by not marking read-
> only the VRAM region. Unless the guest stores data in the VRAM region

It is easy to trigger an IOMMU faults by guest, like set an invalid DMA address. We cannot prevent it.

> that's not directly displayed information, but needs to be preserved
> across migration, the only downside of this would be temporary screen
> corruption in the guest immediately following migration. Clearly far
> better than eventually turning off passed through devices due to excessive IOMMU faults.
> 
>> Even worse, I think two page tables
>> introduce redundant code and maintain effort. So I wonder is it
>> really necessary to implement the separate VT-d large page?
> 
> Tim and I certainly think so. Andrew had a valid point in stating that
> guests without the need for VRAM dirty tracking, but with assigned PCI
> device(s) could still benefit from sharing page tables, so working
> towards a model where this could be retained would clearly be the optimal solution.

I still cannot understand what we want to achieve or fix by separate the page table.

> 
> I wonder whether you actually too the time to go through the old
> thread before writing your mail, since all you did is repeat your old
> arguments, without addressing any of the reasons given why we consider
> separate page tables better than shared ones.

I have gone through it many times. But I cannot find a strong reason to persuade me to do it. I think it is a 'nice to have' feature, not required. Unless if anyone can show me that current sharing mechanism do have problem and it only can be fixed by separate page table, then I think we must to do it. 

> 
> Jan


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: "Keir Fraser \(keir.xen@gmail.com\)" <keir.xen@gmail.com>, "Zhang, Xiantao" <xiantao.zhang@intel.com>, Jan Beulich <JBeulich@suse.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 20 May 2014 03:13:20 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACDED4@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

George Dunlap wrote on 2014-05-19:
>> 
>> Hi all
>> 
>> Sorry to turn out this old thread.
>> Because I just noticed that someone is asking when Intel will
>> implement the
> VT-d page table separately. Actually, I am totally unaware it. The
> original issue that this patch tries to fix is the VRAM tracking which
> using the global log dirty mode. And I thought the best solution to fix it is in VRAM side not VT-d side.
> Because even use separate VT-d page table, we still cannot track the
> memory update from DMA. Even worse, I think two page tables introduce
> redundant code and maintain effort. So I wonder is it really necessary
> to implement the separate VT-d large page?
> 
> Yes, it does introduce redundant code.  But unfortunately, IOMMU faults
> at the moment have to be considered rather risky; having on happens

You cannot prevent guest to trigger the IOMMU faults. It is easy to trigger an IOMMU fault by setting a invalid DMA address.

> risks (in order of decreasing probability / increasing damage): * Device
> stops working for that VM until an FLR (losing a lot of its state) * The
> VM has to be killed * The device stops working until a host reboot * The
> host crashes
> 
> Avoiding these by "hoping" that the guest OS doesn't DMA into a video
> buffer isn't really robust enough.  I think that was Tim and Jan's

Video buffer is only one case. How we can prevent the DMA to other reserved region?

> primary reason for wanting the ability to have separate tables for HAP and IOMMU.
> 
> Is that about right, Jan / Tim?
> 
>  -George


Best regards,
Yang


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, George Dunlap <george.dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Keir Fraser <keir@xen.org>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 20 May 2014 08:17:35 +0100
Message-ID: <537B1DAF0200007800013EAB@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 20.05.14 at 05:09, <yang.z.zhang@intel.com> wrote:
> Jan Beulich wrote on 2014-05-19:
>>>>> On 19.05.14 at 09:48, <yang.z.zhang@intel.com> wrote:
>>> Because I just noticed that someone is asking when Intel will
>>> implement the VT-d page table separately. Actually, I am totally unaware it.
>> 
>> This was a request sent directly to you, so it shouldn't be a surprise.
> 
> Yes, but I am not buying in it since I think it not the right direction to 
> fix this issue:
> 
> This is my original point:
> Actually, the first solution came to my mind is B. Then I realized that even 
> chose B, we still cannot track the memory updating from DMA(even with A/D 
> bit, it still a problem). Also, considering the current usage case of log 
> dirty in Xen(only vram tracking has problem), I though A is better.: 
> Hypervisor only need to track the vram change. If a malicious guest try to 
> DMA to vram range, it only crashed himself (This should be reasonable).

Except that the guest in no way needs to be malicious. I think you
forget that the ultimate goal of virtualization ought to be to make
guests behave the same (in terms of correctness, not generally in
terms of performance) as if run on real hardware, not matter what
they do. And DMA to VRAM wouldn't crash on real hardware (and I
can see legitimate uses of such).

>>> The original
>>> issue that this patch tries to fix is the VRAM tracking which using
>>> the global log dirty mode. And I thought the best solution to fix it
>>> is in VRAM side not VT-d side. Because even use separate VT-d page
>>> table, we still cannot track the memory update from DMA.
>> 
>> Correct. But at least we can avoid IOMMU faults by not marking read-
>> only the VRAM region. Unless the guest stores data in the VRAM region
> 
> It is easy to trigger an IOMMU faults by guest, like set an invalid DMA 
> address. We cannot prevent it.

Correct (and this btw also contradicts above spelled out goal: On
real hardware this would in the majority of cases also just cause
certain operations to become ineffectual rather than bring down
the system - we just have to be more rigid here to deal with
potential malicious guests).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 20 May 2014 08:20:18 +0100
Message-ID: <537B1E520200007800013EB7@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote:
> George Dunlap wrote on 2014-05-19:
>> Avoiding these by "hoping" that the guest OS doesn't DMA into a video
>> buffer isn't really robust enough.  I think that was Tim and Jan's
> 
> Video buffer is only one case. How we can prevent the DMA to other reserved 
> region?

You continue to neglect the difference: Accessing VRAM this way is
legitimate (and potentially useful). And - as just said in the other
reply - ideally we'd also simply ignore accesses to reserved regions
(and in fact we try to, by not immediately bringing down a guest
device doing such).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <George.Dunlap@eu.citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 20 May 2014 11:12:41 +0100
Message-ID: <CAFLBxZYgx10bwKz1fQ_2vTs2mE5DvxgBUwmbEE5k1toH84robQ@mail.gmail.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote:
>> George Dunlap wrote on 2014-05-19:
>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a video
>>> buffer isn't really robust enough.  I think that was Tim and Jan's
>>
>> Video buffer is only one case. How we can prevent the DMA to other reserved
>> region?
>
> You continue to neglect the difference: Accessing VRAM this way is
> legitimate (and potentially useful). And - as just said in the other
> reply - ideally we'd also simply ignore accesses to reserved regions
> (and in fact we try to, by not immediately bringing down a guest
> device doing such).

On the other hand, just to play devil's advocate here: Implementing
separate IOMMU tables (including superpages) isn't free; it has a
non-negligible cost, both in initial developer time, continuing
maintenance (code complexity, fixing bugs), extra memory at run-time,
&c.

Of all the things we could invest that developer time doing, why
should we make it possible to DMA into VRAM, rather than doing
something else?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "George Dunlap" <George.Dunlap@eu.citrix.com>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Tue, 20 May 2014 11:46:31 +0100
Message-ID: <537B4EA70200007800014085@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote:
> On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote:
>>> George Dunlap wrote on 2014-05-19:
>>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a video
>>>> buffer isn't really robust enough.  I think that was Tim and Jan's
>>>
>>> Video buffer is only one case. How we can prevent the DMA to other reserved
>>> region?
>>
>> You continue to neglect the difference: Accessing VRAM this way is
>> legitimate (and potentially useful). And - as just said in the other
>> reply - ideally we'd also simply ignore accesses to reserved regions
>> (and in fact we try to, by not immediately bringing down a guest
>> device doing such).
> 
> On the other hand, just to play devil's advocate here: Implementing
> separate IOMMU tables (including superpages) isn't free; it has a
> non-negligible cost, both in initial developer time, continuing
> maintenance (code complexity, fixing bugs), extra memory at run-time,
> &c.
> 
> Of all the things we could invest that developer time doing, why
> should we make it possible to DMA into VRAM, rather than doing
> something else?

While I agree that the question is valid, my position really is that it
was a mistake to implement the IOMMU code without superpage
support, i.e. I view this as a shortcoming independent of the VRAM
issue, and I would want to see this fixed rather sooner than later.
Had it been done properly from the beginning (like one would
expect for non-experimental code), a lot of this discussion could
have been avoided, and we wouldn't have had to take the
respective workaround close to the 4.4 release.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: George Dunlap <George.Dunlap@eu.citrix.com>, Jan Beulich <JBeulich@suse.com>
Cc: "Zhang, Xiantao" <xiantao.zhang@intel.com>, "Keir Fraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 21 May 2014 01:02:41 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACF345@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

Jan Beulich wrote on 2014-05-20:
>>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote:
>> On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote:
>>>> George Dunlap wrote on 2014-05-19:
>>>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a
>>>>> video buffer isn't really robust enough.  I think that was Tim
>>>>> and Jan's
>>>> 
>>>> Video buffer is only one case. How we can prevent the DMA to other
>>>> reserved region?
>>> 
>>> You continue to neglect the difference: Accessing VRAM this way is
>>> legitimate (and potentially useful). And - as just said in the
>>> other reply - ideally we'd also simply ignore accesses to reserved

Can you give an example of what legitmate case you are mentioned twice(You mentioned it also in other reply)? I cannot understand why we need to restrict the CPU access to VRAM region but allow accessing from device. As I known, for gfx passthrough, both device and CPU are able to access them. And for emulated gfx, only software will access it which same as current we see in Xen.

>>> regions (and in fact we try to, by not immediately bringing down a
>>> guest device doing such).
>> 
>> On the other hand, just to play devil's advocate here: Implementing
>> separate IOMMU tables (including superpages) isn't free; it has a
>> non-negligible cost, both in initial developer time, continuing
>> maintenance (code complexity, fixing bugs), extra memory at
>> run-time, &c.
>> 
>> Of all the things we could invest that developer time doing, why
>> should we make it possible to DMA into VRAM, rather than doing
>> something else?
> 
> While I agree that the question is valid, my position really is that
> it was a mistake to implement the IOMMU code without superpage

We support the superpage via sharing EPT and VT-d pagetable.

> support, i.e. I view this as a shortcoming independent of the VRAM
> issue, and I would want to see this fixed rather sooner than later.
> Had it been done properly from the beginning (like one would expect
> for non-experimental code), a lot of this discussion could have been
> avoided, and we wouldn't have had to take the respective workaround
> close to the 4.4 release.

I still think the best solution is fixing the VRAM global log dirty mechanism which my previous patch already did. Because I cannot see any benefit with separating the page table.

> 
> Jan


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 21 May 2014 08:49:15 +0100
Message-ID: <537C769B02000078000145C2@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 21.05.14 at 03:02, <yang.z.zhang@intel.com> wrote:
> Jan Beulich wrote on 2014-05-20:
>>>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote:
>>> On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote:
>>>>> George Dunlap wrote on 2014-05-19:
>>>>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a
>>>>>> video buffer isn't really robust enough.  I think that was Tim
>>>>>> and Jan's
>>>>> 
>>>>> Video buffer is only one case. How we can prevent the DMA to other
>>>>> reserved region?
>>>> 
>>>> You continue to neglect the difference: Accessing VRAM this way is
>>>> legitimate (and potentially useful). And - as just said in the
>>>> other reply - ideally we'd also simply ignore accesses to reserved
> 
> Can you give an example of what legitmate case you are mentioned twice(You 
> mentioned it also in other reply)?

What's wrong with loading some graphics data right from an I/O device
(disk, network) into the frame buffer?

> I cannot understand why we need to 
> restrict the CPU access to VRAM region but allow accessing from device.

We'd love to also do this for device accesses, but to date IOMMU
fault are unrecoverable, and hence this is not an option.

> As I 
> known, for gfx passthrough, both device and CPU are able to access them. And 
> for emulated gfx, only software will access it which same as current we see 
> in Xen.

Why's that? I.e. why should a guest with emulated graphics not be
able to program a passed through device to access the VRAM directly?

>>>> regions (and in fact we try to, by not immediately bringing down a
>>>> guest device doing such).
>>> 
>>> On the other hand, just to play devil's advocate here: Implementing
>>> separate IOMMU tables (including superpages) isn't free; it has a
>>> non-negligible cost, both in initial developer time, continuing
>>> maintenance (code complexity, fixing bugs), extra memory at
>>> run-time, &c.
>>> 
>>> Of all the things we could invest that developer time doing, why
>>> should we make it possible to DMA into VRAM, rather than doing
>>> something else?
>> 
>> While I agree that the question is valid, my position really is that
>> it was a mistake to implement the IOMMU code without superpage
> 
> We support the superpage via sharing EPT and VT-d pagetable.

I.e. without EPT no superpages. Afaict there's no such limitation on
the AMD side.

>> support, i.e. I view this as a shortcoming independent of the VRAM
>> issue, and I would want to see this fixed rather sooner than later.
>> Had it been done properly from the beginning (like one would expect
>> for non-experimental code), a lot of this discussion could have been
>> avoided, and we wouldn't have had to take the respective workaround
>> close to the 4.4 release.
> 
> I still think the best solution is fixing the VRAM global log dirty 
> mechanism which my previous patch already did. Because I cannot see any 
> benefit with separating the page table.

You didn't in any way negate the condition of superpage support to
be added post-4.4 in order for your other change to go in: Neither
http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230.html
nor
http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236.html
have been responded to by you. By not doing so, to me at least you
implicitly accepted the condition. And by now refusing to meet it, you
basically tell us that we shouldn't be doing compromises like this with
you in the future.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "Nakajima, Jun" <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Zhang,  Xiantao" <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 21 May 2014 08:37:35 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AACF97A@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

Jan Beulich wrote on 2014-05-21:
>>>> On 21.05.14 at 03:02, <yang.z.zhang@intel.com> wrote:
>> Jan Beulich wrote on 2014-05-20:
>>>>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote:
>>>> On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com>
> wrote:
>>>>>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote:
>>>>>> George Dunlap wrote on 2014-05-19:
>>>>>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a
>>>>>>> video buffer isn't really robust enough.  I think that was Tim
>>>>>>> and Jan's
>>>>>> 
>>>>>> Video buffer is only one case. How we can prevent the DMA to other
>>>>>> reserved region?
>>>>> 
>>>>> You continue to neglect the difference: Accessing VRAM this way is
>>>>> legitimate (and potentially useful). And - as just said in the
>>>>> other reply - ideally we'd also simply ignore accesses to reserved
>> 
>> Can you give an example of what legitmate case you are mentioned twice(You
>> mentioned it also in other reply)?
> 
> What's wrong with loading some graphics data right from an I/O device
> (disk, network) into the frame buffer?

Yes, it is ok. But we are talking about the DMA access to a 'readonly' buffer. It is totally a wrong usage model to me.

> 
>> I cannot understand why we need to
>> restrict the CPU access to VRAM region but allow accessing from device.
> 
> We'd love to also do this for device accesses, but to date IOMMU
> fault are unrecoverable, and hence this is not an option.
> 
>> As I known, for gfx passthrough, both device and CPU are able to access
>> them. And for emulated gfx, only software will access it which same as
>> current we see in Xen.
> 
> Why's that? I.e. why should a guest with emulated graphics not be
> able to program a passed through device to access the VRAM directly?

I think you confused the background of the issue: the essence of the issue is that we marked all pages as read-only and meanwhile we allow DMA to access those page. So the right solution is only mark the required page (in this case means VRAM region) as readonly and if guest still DMA to that region, then any unpredictable behavior to that guest is reasonable. What you said before totally ignore the 'read-only' thing. If there is no read-only operation, there will be no problem and no discussion around it.

> 
>>>>> regions (and in fact we try to, by not immediately bringing down a
>>>>> guest device doing such).
>>>> 
>>>> On the other hand, just to play devil's advocate here: Implementing
>>>> separate IOMMU tables (including superpages) isn't free; it has a
>>>> non-negligible cost, both in initial developer time, continuing
>>>> maintenance (code complexity, fixing bugs), extra memory at
>>>> run-time, &c.
>>>> 
>>>> Of all the things we could invest that developer time doing, why
>>>> should we make it possible to DMA into VRAM, rather than doing
>>>> something else?
>>> 
>>> While I agree that the question is valid, my position really is that
>>> it was a mistake to implement the IOMMU code without superpage
>> 
>> We support the superpage via sharing EPT and VT-d pagetable.
> 
> I.e. without EPT no superpages. Afaict there's no such limitation on
> the AMD side.

I don't think this is a 'limitation'. The original motivation to sharing page table is to use memory efficiently and easy for code maintain. And when we tried to add superpage support, we follow this way (we only need to change one copy of page table). But you think it is a limitation. I cannot understand.

> 
>>> support, i.e. I view this as a shortcoming independent of the VRAM
>>> issue, and I would want to see this fixed rather sooner than later.
>>> Had it been done properly from the beginning (like one would expect
>>> for non-experimental code), a lot of this discussion could have been
>>> avoided, and we wouldn't have had to take the respective workaround
>>> close to the 4.4 release.
>> 
>> I still think the best solution is fixing the VRAM global log dirty
>> mechanism which my previous patch already did. Because I cannot see any
>> benefit with separating the page table.
> 
> You didn't in any way negate the condition of superpage support to be
> added post-4.4 in order for your other change to go in: Neither
> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230.html
>  nor
> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236.html
>  have been responded to by you. By not doing so, to me at least you
> implicitly accepted the condition. And by now refusing to meet it, you
> basically tell us that we shouldn't be doing compromises like this with
> you in the future.

I have said before I am totally unware of it. And I know it only two days ago after someone told me. So please do not confuse this with the thing what we are discussing now. If you think I gave a promise implicitly at that time, I am sorry to let you think so.

As I said in previous thread, if we can prove that add hugepage for the separate VT-d page table is the only choice to solve problem, then now I am promising that I will do it ASAP. But till now, I didn't see any point that we must to have it. To me, it is still a nice to have feature.

> 
> Jan


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Wed, 21 May 2014 10:58:01 +0100
Message-ID: <537C94C902000078000146F5@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote:
> Jan Beulich wrote on 2014-05-21:
>>>>> On 21.05.14 at 03:02, <yang.z.zhang@intel.com> wrote:
>>> Jan Beulich wrote on 2014-05-20:
>>>>>>> On 20.05.14 at 12:12, <George.Dunlap@eu.citrix.com> wrote:
>>>>> On Tue, May 20, 2014 at 8:20 AM, Jan Beulich <JBeulich@suse.com>
>> wrote:
>>>>>>>>> On 20.05.14 at 05:13, <yang.z.zhang@intel.com> wrote:
>>>>>>> George Dunlap wrote on 2014-05-19:
>>>>>>>> Avoiding these by "hoping" that the guest OS doesn't DMA into a
>>>>>>>> video buffer isn't really robust enough.  I think that was Tim
>>>>>>>> and Jan's
>>>>>>> 
>>>>>>> Video buffer is only one case. How we can prevent the DMA to other
>>>>>>> reserved region?
>>>>>> 
>>>>>> You continue to neglect the difference: Accessing VRAM this way is
>>>>>> legitimate (and potentially useful). And - as just said in the
>>>>>> other reply - ideally we'd also simply ignore accesses to reserved
>>> 
>>> Can you give an example of what legitmate case you are mentioned twice(You
>>> mentioned it also in other reply)?
>> 
>> What's wrong with loading some graphics data right from an I/O device
>> (disk, network) into the frame buffer?
> 
> Yes, it is ok. But we are talking about the DMA access to a 'readonly' 
> buffer. It is totally a wrong usage model to me.

What read-only buffer are you referring to? From guest perspective,
nothing that gets marked read-only due to log-dirty handling really is
read-only. The fact that migration doesn't work due to this with an
assigned device can be excused by the handling of such an assigned
device not being implemented anyway. But the fact that behind the
scenes VRAM gets marked read-only should be (but isn't) entirely
transparent to the guest.

I'm not going to reply to the rest of your mail, both because I'm getting
the impression that we're moving in circles, and because I might become
impolite otherwise.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Fri, 23 May 2014 07:42:20 +0100
Message-ID: <537F09EC020000780001530D@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote:
> Jan Beulich wrote on 2014-05-21:
>> You didn't in any way negate the condition of superpage support to be
>> added post-4.4 in order for your other change to go in: Neither
>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230.html 
>>  nor
>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236.html 
>>  have been responded to by you. By not doing so, to me at least you
>> implicitly accepted the condition. And by now refusing to meet it, you
>> basically tell us that we shouldn't be doing compromises like this with
>> you in the future.
> 
> I have said before I am totally unware of it. And I know it only two days 
> ago after someone told me. So please do not confuse this with the thing what 
> we are discussing now. If you think I gave a promise implicitly at that time, 
> I am sorry to let you think so.
> 
> As I said in previous thread, if we can prove that add hugepage for the 
> separate VT-d page table is the only choice to solve problem, then now I am 
> promising that I will do it ASAP. But till now, I didn't see any point that 
> we must to have it. To me, it is still a nice to have feature.

Btw., I think I just spotted a second thing not working without split
page tables: mem-access (which doesn't and imo shouldn't depend
on !need_iommu(), other than mem-sharing and mem-paging)
likewise has the potential of creating entries resulting in IOMMU
faults.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "Nakajima, Jun" <jun.nakajima@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "Zhang,  Xiantao" <xiantao.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 26 May 2014 08:16:43 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E0AAD38E7@SHSMSX104.ccr.corp.intel.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

Jan Beulich wrote on 2014-05-23:
>>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote:
>> Jan Beulich wrote on 2014-05-21:
>>> You didn't in any way negate the condition of superpage support to
>>> be added post-4.4 in order for your other change to go in: Neither
>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230.
>>> html
>>>  nor
>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236.
>>> html  have been responded to by you. By not doing so, to me at
>>> least you implicitly accepted the condition. And by now refusing to
>>> meet it, you basically tell us that we shouldn't be doing
>>> compromises like this with you in the future.
>> 
>> I have said before I am totally unware of it. And I know it only two
>> days ago after someone told me. So please do not confuse this with
>> the thing what we are discussing now. If you think I gave a promise
>> implicitly at that time, I am sorry to let you think so.
>> 
>> As I said in previous thread, if we can prove that add hugepage for
>> the separate VT-d page table is the only choice to solve problem,
>> then now I am promising that I will do it ASAP. But till now, I
>> didn't see any point that we must to have it. To me, it is still a nice to have feature.
> 
> Btw., I think I just spotted a second thing not working without split page tables:
> mem-access (which doesn't and imo shouldn't depend on !need_iommu(),
> other than mem-sharing and mem-paging) likewise has the potential of
> creating entries resulting in IOMMU faults.
> 

I don't know what mem-access is? Do you mean Xenaccess? If not, can you elaborate it or provide a link to help me to understand how it works?

> Jan


Best regards,
Yang



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Yang Z Zhang" <yang.z.zhang@intel.com>
Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 26 May 2014 10:04:57 +0100
Message-ID: <53831FD902000078000159F9@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote:
> Jan Beulich wrote on 2014-05-23:
>>>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote:
>>> Jan Beulich wrote on 2014-05-21:
>>>> You didn't in any way negate the condition of superpage support to
>>>> be added post-4.4 in order for your other change to go in: Neither
>>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230.
>>>> html
>>>>  nor
>>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236.
>>>> html  have been responded to by you. By not doing so, to me at
>>>> least you implicitly accepted the condition. And by now refusing to
>>>> meet it, you basically tell us that we shouldn't be doing
>>>> compromises like this with you in the future.
>>> 
>>> I have said before I am totally unware of it. And I know it only two
>>> days ago after someone told me. So please do not confuse this with
>>> the thing what we are discussing now. If you think I gave a promise
>>> implicitly at that time, I am sorry to let you think so.
>>> 
>>> As I said in previous thread, if we can prove that add hugepage for
>>> the separate VT-d page table is the only choice to solve problem,
>>> then now I am promising that I will do it ASAP. But till now, I
>>> didn't see any point that we must to have it. To me, it is still a nice to 
> have feature.
>> 
>> Btw., I think I just spotted a second thing not working without split page 
> tables:
>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(),
>> other than mem-sharing and mem-paging) likewise has the potential of
>> creating entries resulting in IOMMU faults.
>> 
> 
> I don't know what mem-access is? Do you mean Xenaccess? If not, can you 
> elaborate it or provide a link to help me to understand how it works?

The (example) tool indeed is named xen-access. See XENMEM_access_op
(used to be HVMOP_{get,set}_mem_access).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Nakajima, Jun" <jun.nakajima@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Fri, 30 May 2014 18:26:18 -0700
Message-ID: <CAL54oT3edU8dE+gGSbaPerkuhdgTchW4n=j3+syzuc9xEGxmcQ@mail.gmail.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote:
>> Jan Beulich wrote on 2014-05-23:
>>>>>> On 21.05.14 at 10:37, <yang.z.zhang@intel.com> wrote:
>>>> Jan Beulich wrote on 2014-05-21:
>>>>> You didn't in any way negate the condition of superpage support to
>>>>> be added post-4.4 in order for your other change to go in: Neither
>>>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01230.
>>>>> html
>>>>>  nor
>>>>> http://lists.xenproject.org/archives/html/xen-devel/2014-02/msg01236.
>>>>> html  have been responded to by you. By not doing so, to me at
>>>>> least you implicitly accepted the condition. And by now refusing to
>>>>> meet it, you basically tell us that we shouldn't be doing
>>>>> compromises like this with you in the future.
>>>>
>>>> I have said before I am totally unware of it. And I know it only two
>>>> days ago after someone told me. So please do not confuse this with
>>>> the thing what we are discussing now. If you think I gave a promise
>>>> implicitly at that time, I am sorry to let you think so.
>>>>
>>>> As I said in previous thread, if we can prove that add hugepage for
>>>> the separate VT-d page table is the only choice to solve problem,
>>>> then now I am promising that I will do it ASAP. But till now, I
>>>> didn't see any point that we must to have it. To me, it is still a nice to
>> have feature.
>>>
>>> Btw., I think I just spotted a second thing not working without split page
>> tables:
>>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(),
>>> other than mem-sharing and mem-paging) likewise has the potential of
>>> creating entries resulting in IOMMU faults.
>>>
>>
>> I don't know what mem-access is? Do you mean Xenaccess? If not, can you
>> elaborate it or provide a link to help me to understand how it works?
>
> The (example) tool indeed is named xen-access. See XENMEM_access_op
> (used to be HVMOP_{get,set}_mem_access).
>

The tool xen-access is located in tools/tests, and I think that this
is used mostly by developers who know what they are doing.
If we had separate VT-d page tables, they might observe confusing
results; even if they write-protect pages, somebody (i.e. I/O devices)
modifies those pages.
To me, observing IOMMU faults seems consistent with the consequence of
changes to guest memory permission.

-- 
Jun
Intel Open Source Technology Center

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "Jun Nakajima" <jun.nakajima@intel.com>
Cc: Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 02 Jun 2014 07:55:00 +0100
Message-ID: <538C3BE40200007800016AE1@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 31.05.14 at 03:26, <jun.nakajima@intel.com> wrote:
> On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote:
>>> Jan Beulich wrote on 2014-05-23:
>>>> Btw., I think I just spotted a second thing not working without split page
>>> tables:
>>>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(),
>>>> other than mem-sharing and mem-paging) likewise has the potential of
>>>> creating entries resulting in IOMMU faults.
>>>>
>>>
>>> I don't know what mem-access is? Do you mean Xenaccess? If not, can you
>>> elaborate it or provide a link to help me to understand how it works?
>>
>> The (example) tool indeed is named xen-access. See XENMEM_access_op
>> (used to be HVMOP_{get,set}_mem_access).
>>
> 
> The tool xen-access is located in tools/tests, and I think that this
> is used mostly by developers who know what they are doing.

The tool is, indeed. But the underlying feature clearly isn't limited
to or solely intended for developers.

> If we had separate VT-d page tables, they might observe confusing
> results; even if they write-protect pages, somebody (i.e. I/O devices)
> modifies those pages.
> To me, observing IOMMU faults seems consistent with the consequence of
> changes to guest memory permission.

And I would agree if these faults were restartable. You're certainly
aware that a not too large amount of faults within a reasonably short
period of time will lead to the device being turned off, with quite likely
fatal consequences to the guest.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <George.Dunlap@eu.citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 2 Jun 2014 15:06:17 +0100
Message-ID: <CAFLBxZYUp_Kws0-M9wgoyKvM6G_SwD1+DGwTECAK0M5-_h1OTQ@mail.gmail.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On Mon, Jun 2, 2014 at 7:55 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 31.05.14 at 03:26, <jun.nakajima@intel.com> wrote:
>> On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote:
>>>> Jan Beulich wrote on 2014-05-23:
>>>>> Btw., I think I just spotted a second thing not working without split page
>>>> tables:
>>>>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(),
>>>>> other than mem-sharing and mem-paging) likewise has the potential of
>>>>> creating entries resulting in IOMMU faults.
>>>>>
>>>>
>>>> I don't know what mem-access is? Do you mean Xenaccess? If not, can you
>>>> elaborate it or provide a link to help me to understand how it works?
>>>
>>> The (example) tool indeed is named xen-access. See XENMEM_access_op
>>> (used to be HVMOP_{get,set}_mem_access).
>>>
>>
>> The tool xen-access is located in tools/tests, and I think that this
>> is used mostly by developers who know what they are doing.
>
> The tool is, indeed. But the underlying feature clearly isn't limited
> to or solely intended for developers.
>
>> If we had separate VT-d page tables, they might observe confusing
>> results; even if they write-protect pages, somebody (i.e. I/O devices)
>> modifies those pages.
>> To me, observing IOMMU faults seems consistent with the consequence of
>> changes to guest memory permission.
>
> And I would agree if these faults were restartable. You're certainly
> aware that a not too large amount of faults within a reasonably short
> period of time will lead to the device being turned off, with quite likely
> fatal consequences to the guest.

Sure -- but there are a number of features (PoD, paging, page sharing,
even migration) which are incompatible with pass-through, and the user
is simply not allowed to use them together.  Why not just add this one
to the list?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: "Jan Beulich" <JBeulich@suse.com>
To: "George Dunlap" <George.Dunlap@eu.citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Xiantao Zhang <xiantao.zhang@intel.com>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, Tim Deegan <tim@xen.org>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 02 Jun 2014 15:27:20 +0100
Message-ID: <538CA5E80200007800016E34@mail.emea.novell.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

>>> On 02.06.14 at 16:06, <George.Dunlap@eu.citrix.com> wrote:
> On Mon, Jun 2, 2014 at 7:55 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 31.05.14 at 03:26, <jun.nakajima@intel.com> wrote:
>>> On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote:
>>>>> Jan Beulich wrote on 2014-05-23:
>>>>>> Btw., I think I just spotted a second thing not working without split page
>>>>> tables:
>>>>>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(),
>>>>>> other than mem-sharing and mem-paging) likewise has the potential of
>>>>>> creating entries resulting in IOMMU faults.
>>>>>>
>>>>>
>>>>> I don't know what mem-access is? Do you mean Xenaccess? If not, can you
>>>>> elaborate it or provide a link to help me to understand how it works?
>>>>
>>>> The (example) tool indeed is named xen-access. See XENMEM_access_op
>>>> (used to be HVMOP_{get,set}_mem_access).
>>>>
>>>
>>> The tool xen-access is located in tools/tests, and I think that this
>>> is used mostly by developers who know what they are doing.
>>
>> The tool is, indeed. But the underlying feature clearly isn't limited
>> to or solely intended for developers.
>>
>>> If we had separate VT-d page tables, they might observe confusing
>>> results; even if they write-protect pages, somebody (i.e. I/O devices)
>>> modifies those pages.
>>> To me, observing IOMMU faults seems consistent with the consequence of
>>> changes to guest memory permission.
>>
>> And I would agree if these faults were restartable. You're certainly
>> aware that a not too large amount of faults within a reasonably short
>> period of time will lead to the device being turned off, with quite likely
>> fatal consequences to the guest.
> 
> Sure -- but there are a number of features (PoD, paging, page sharing,
> even migration) which are incompatible with pass-through, and the user
> is simply not allowed to use them together.  Why not just add this one
> to the list?

Considering that mem-access came at about the same time or even
later than mem-paging and mem-sharing, I was silently assuming that
it wasn't enforcing the same restrictions as the other two for a reason.
Maybe it was just forgotten...

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Xiantao Zhang <xiantao.zhang@intel.com>, Yang Z Zhang <yang.z.zhang@intel.com>, "KeirFraser\(keir.xen@gmail.com\)" <keir.xen@gmail.com>, Tim Deegan <tim@xen.org>, "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jun Nakajima <jun.nakajima@intel.com>
Subject: Re: [Xen-devel] [PATCH] Don't track all memory when enabling log dirty to track vram
Date: Mon, 2 Jun 2014 16:03:20 +0100
Message-ID: <538C9238.5080701@eu.citrix.com>

[ Reply to this message; Retrieve Raw Message; Archives: gmane, marc.info ]

On 06/02/2014 03:27 PM, Jan Beulich wrote:
>>>> On 02.06.14 at 16:06, <George.Dunlap@eu.citrix.com> wrote:
>> On Mon, Jun 2, 2014 at 7:55 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 31.05.14 at 03:26, <jun.nakajima@intel.com> wrote:
>>>> On Mon, May 26, 2014 at 2:04 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> On 26.05.14 at 10:16, <yang.z.zhang@intel.com> wrote:
>>>>>> Jan Beulich wrote on 2014-05-23:
>>>>>>> Btw., I think I just spotted a second thing not working without split page
>>>>>> tables:
>>>>>>> mem-access (which doesn't and imo shouldn't depend on !need_iommu(),
>>>>>>> other than mem-sharing and mem-paging) likewise has the potential of
>>>>>>> creating entries resulting in IOMMU faults.
>>>>>>>
>>>>>> I don't know what mem-access is? Do you mean Xenaccess? If not, can you
>>>>>> elaborate it or provide a link to help me to understand how it works?
>>>>> The (example) tool indeed is named xen-access. See XENMEM_access_op
>>>>> (used to be HVMOP_{get,set}_mem_access).
>>>>>
>>>> The tool xen-access is located in tools/tests, and I think that this
>>>> is used mostly by developers who know what they are doing.
>>> The tool is, indeed. But the underlying feature clearly isn't limited
>>> to or solely intended for developers.
>>>
>>>> If we had separate VT-d page tables, they might observe confusing
>>>> results; even if they write-protect pages, somebody (i.e. I/O devices)
>>>> modifies those pages.
>>>> To me, observing IOMMU faults seems consistent with the consequence of
>>>> changes to guest memory permission.
>>> And I would agree if these faults were restartable. You're certainly
>>> aware that a not too large amount of faults within a reasonably short
>>> period of time will lead to the device being turned off, with quite likely
>>> fatal consequences to the guest.
>> Sure -- but there are a number of features (PoD, paging, page sharing,
>> even migration) which are incompatible with pass-through, and the user
>> is simply not allowed to use them together.  Why not just add this one
>> to the list?
> Considering that mem-access came at about the same time or even
> later than mem-paging and mem-sharing, I was silently assuming that
> it wasn't enforcing the same restrictions as the other two for a reason.
> Maybe it was just forgotten...

Well one can certainly imagine someone wanting to run it in that mode, 
and "knowing" that it would be safe because they were going to be 
careful not going to use memaccess on pages on which a DMA was going to 
happen.  I think in that situation, a developer would probably want to 
know that their assumption was wrong immediately (by having the IOMMU 
fault perhaps crash the domain), than have to figure out that their 
assumption was wrong by having unexplainable results.

If memaccess + passthru is enabled by default at the moment, there may 
be an argument for disabling it by default, perhaps with an override for 
those who want the ability to shoot themselves in the foot.  But I don't 
think that silent violations of the memaccess "promise" is better than 
just having the IOMMU faults; so I don't think that's an argument for 
implementing non-shared IOMMU superpages.

  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel